[Feature] Support Deepseek-VL2 #2798
base: main
Conversation
@@ -0,0 +1,127 @@
from typing import List, Optional, Tuple, Union
rename the file to deepseek_vl2?
rename done
        self.layers = modules

    def forward(self, x):
I have not yet implemented the forward part of the DeepseekV2ForCausalLM. I will finish all the implementations and add the unit test this weekend.
@ccw1996 Do you need our help?
Has support for deepseek vl2 been implemented?
if config.projector_type == "downsample_mlp_gelu":
    mlp_depth = config.depth
    mlp_ratio = config.mlp_ratio
    modules = [nn.Linear(config.input_dim * config.downsample_ratio * config.downsample_ratio, config.n_embed * mlp_ratio)]
    for _ in range(1, mlp_depth - 1):
        modules.append(nn.GELU())
        modules.append(nn.Linear(config.n_embed * mlp_ratio, config.n_embed * mlp_ratio))
    modules.append(nn.GELU())
    modules.append(nn.Linear(config.n_embed * mlp_ratio, config.n_embed))
    modules = nn.Sequential(*modules)
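For context on the shape arithmetic in the first nn.Linear: the vision features are spatially downsampled by concatenating each downsample_ratio x downsample_ratio block of tokens channel-wise, so the projector's input width is input_dim * downsample_ratio**2. A self-contained sketch with hypothetical config values (not taken from this PR):

import torch
import torch.nn as nn

# Hypothetical values for illustration only: vision width 1152, 2x2
# downsampling, language hidden size 2560, mlp_ratio 1, depth 2.
input_dim, downsample_ratio, n_embed, mlp_ratio, mlp_depth = 1152, 2, 2560, 1, 2

modules = [nn.Linear(input_dim * downsample_ratio**2, n_embed * mlp_ratio)]
for _ in range(1, mlp_depth - 1):
    modules.append(nn.GELU())
    modules.append(nn.Linear(n_embed * mlp_ratio, n_embed * mlp_ratio))
modules.append(nn.GELU())
modules.append(nn.Linear(n_embed * mlp_ratio, n_embed))
projector = nn.Sequential(*modules)

# 576 downsampled vision tokens, each a flattened 2x2 block of features.
x = torch.randn(1, 576, input_dim * downsample_ratio**2)
print(projector(x).shape)  # torch.Size([1, 576, 2560])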
@ccw1996 I'm happy to take the rest of the work to parallelize the remaining functions. Could you give me access to your branch?
@ccw1996 Apologies for the delay. Would you like me to help with the rest of it?
@ccw1996 I see, I think you can copy those layers from timm into python/sglang/srt/models/deepseekvl2.py and then replace the layers with sgl classes. I'm interested in helping if you can give me access.
@yizhang2077 @ispobock Looks like we'll have to copy lots of code from timm. Now it's mostly just the linear layers with variable depth to parallelize; will finish soon.
Sure, can you mark the problematic part?
if config.projector_type == "downsample_mlp_gelu":
    mlp_depth = config.depth
    mlp_ratio = config.mlp_ratio
    modules = [
        nn.Linear(
            config.input_dim
            * config.downsample_ratio
            * config.downsample_ratio,
            config.n_embed * mlp_ratio,
        )
    ]
    for _ in range(1, mlp_depth - 1):
        modules.append(nn.GELU())
        modules.append(
            nn.Linear(config.n_embed * mlp_ratio, config.n_embed * mlp_ratio)
        )
    modules.append(nn.GELU())
    modules.append(nn.Linear(config.n_embed * mlp_ratio, config.n_embed))
    modules = nn.Sequential(*modules)
Need to parallelize this part with Column and Row linear
@yizhang2077 Actually with GELU we'll have to gather output for each TP linear. Should we use replicated linear instead?
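For reference on why the standard split is awkward here: a column-parallel linear followed by an elementwise GELU and then a row-parallel linear needs no gather, because GELU commutes with the column shard and the row-parallel matmul ends in a single all-reduce. A single-process sketch of that identity in plain PyTorch (hypothetical shapes, with the TP ranks simulated via chunk):

import torch
import torch.nn.functional as F

torch.manual_seed(0)
tp = 2
x = torch.randn(4, 8)
w1 = torch.randn(16, 8)  # first linear: 8 -> 16
w2 = torch.randn(8, 16)  # second linear: 16 -> 8

ref = F.gelu(x @ w1.t()) @ w2.t()  # unsharded reference

parts = []
for r in range(tp):
    w1_shard = w1.chunk(tp, dim=0)[r]  # column-parallel: split output rows
    w2_shard = w2.chunk(tp, dim=1)[r]  # row-parallel: split matching input cols
    # GELU is elementwise, so it runs on each shard without communication.
    parts.append(F.gelu(x @ w1_shard.t()) @ w2_shard.t())
out = sum(parts)  # stands in for the final all-reduce

print(torch.allclose(ref, out, atol=1e-5))  # True

With a GELU after every linear at arbitrary depth, the column/row pairing no longer lines up, which is presumably why the projector later in this thread settles on ReplicatedLinear.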
Two problems: first, the radix cache corrupts the input, which I will try to fix; second, the output doesn't seem to use the image embeddings. Can you help me debug it?
Let me try tomorrow.
input_embeds[idx].masked_scatter_(
    image_seq_mask[idx].unsqueeze(-1), images_in_this_batch
)
@ccw1996 The image embedding (images_in_this_batch) is indeed applied to the text embedding here.
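For anyone unfamiliar with masked_scatter_: it fills the masked positions of the destination, in order, with rows of the source, which is how the projected image tokens land in the placeholder slots of the text embedding. A toy example with made-up shapes:

import torch

input_embeds = torch.zeros(6, 4)  # 6 positions, hidden size 4
image_seq_mask = torch.tensor([False, False, True, True, True, False])
images_in_this_batch = torch.ones(3, 4)  # 3 projected image tokens

# The (6, 1) mask broadcasts over the hidden dim; masked rows are filled in order.
input_embeds.masked_scatter_(image_seq_mask.unsqueeze(-1), images_in_this_batch)
print(input_embeds[2])  # tensor([1., 1., 1., 1.])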
@Edenzzzz Thanks a lot. Now it outputs the right answer. I will finish the CUDA graph support and clean up the code this weekend.
logger.info(
    "Automatically turn off --chunked-prefill-size and disable radix cache for deepseek-vl2."
)
server_args.chunked_prefill_size = -1
server_args.disable_radix_cache = True
The language part still supports radix cache.
The language part relies on the input embeddings. If the radix cache is used, the input embeddings are wrong. I will try to debug it.
I see, I think you're right. Llava and qwen_vl also don't use radix attention.
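A plausible explanation for why the radix cache breaks here: prefix matching is keyed on token IDs, and image placeholders expand to the same token IDs regardless of the image content, so two requests carrying different images can match each other's cached prefix and reuse KV entries computed from the wrong image embeddings, which is consistent with the wrong-input-embedding symptom described above.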
            ],
        )
        cls.base_url += "/v1"


if __name__ == "__main__":
    unittest.main()
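As a manual sanity check once the test server is up, the OpenAI-compatible endpoint that cls.base_url points at can be queried directly. A hypothetical client-side probe (the URL, model name, and image URL are placeholders, not from this PR):

import openai

client = openai.Client(base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="default",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
    max_tokens=64,
)
print(response.choices[0].message.content)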
@ccw1996 This seems mostly ready. Did you encounter 400 Bad Request when running Qwen-VL?
I don't know whether qwen-vl works normally; I tested qwen2-vl and it passed.
The tests have not passed. We should test deepseek-vl2, not qwen-vl. There's a dim mismatch when capturing the CUDA graph. You can try to fix it, and then it should be ready.
Sorry, I fixed these errors in the latest commit. Now the tests pass.
@Edenzzzz Can you help me merge all the commits? Now it's ready. Thanks a lot!
    modules = ReplicatedLinear(
        config.input_dim,
        config.n_embed,
        quant_config=quant_config,
    )

elif config.projector_type == "mlp_gelu":
    mlp_depth = config.depth
    modules = [
        ReplicatedLinear(
            config.input_dim,
            config.n_embed,
            quant_config=quant_config,
        )
    ]
    for _ in range(1, mlp_depth):
        modules.append(nn.GELU())
        modules.append(
            ReplicatedLinear(
                config.n_embed,
                config.n_embed,
                quant_config=quant_config,
            )
        )
    modules = nn.Sequential(*modules)
There are still bugs when running the test. Since the linear layers were replaced, we need to take out the first element of the output tuple.
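Concretely: sgl-style parallel linear layers such as ReplicatedLinear return an (output, output_bias) tuple rather than a bare tensor, so a forward loop over the mixed Sequential has to unpack conditionally. A minimal sketch of what that fix could look like (an assumed shape of the fix, not the PR's exact code):

def forward(self, x):
    for layer in self.layers:
        out = layer(x)
        # ReplicatedLinear returns (output, output_bias); nn.GELU returns a tensor.
        x = out[0] if isinstance(out, tuple) else out
    return x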
Motivation
Add the Deepseek-VL2 model to SGLang, as requested in #2653.
Modifications
Checklist