Skip to content

Conversation

@Bihan
Copy link
Collaborator

@BihanBihan commented Dec 20, 2025

Steps To Test

Step1: Create replica-groups-service.yml

# replica-groups-service.yml type: service name: replica-groups-test python: 3.12 replicas: - name: replica-1 # replica name is optional count: 0..2 scaling: metric: rps target: 2 commands: - echo "Group 1 - Version 0" > /tmp/version.txt - python3 -m http.server 8000 resources: cpu: 2 - name: replica-2 count: 0..3 scaling: metric: rps target: 2 commands: - echo "Group 2 - Version 0" > /tmp/version.txt - python3 -m http.server 8000 resources: cpu: 2 port: 8000 

Step2: dstack apply -f replica-groups-service.yml

Step3: Run load_test_replica_groups.py by subsituting your URL and TOKEN

import asyncio import aiohttp import time # ==== Configuration ==== URL = "<URL>" TOKEN = "<TOKEN>" RPS = 8 # Requests per second DURATION = 1800 # Duration in seconds METHOD = "GET" # or "POST" # ======================= HEADERS ={"Content-Type": "application/json", "Authorization": f"Bearer{TOKEN}" } async def send_request(session, idx): """Send a request and print response""" try: async with session.request(METHOD, URL, headers=HEADERS) as resp: text = await resp.text() print(f"\n[{idx}] Status:{resp.status}") # print small part of response (HTML preview) print(text[:200].strip(), "...\n") except Exception as e: print(f"[{idx}] Error:{e}") async def run_load_test(): total_requests = RPS * DURATION interval = 1.0 / RPS async with aiohttp.ClientSession() as session: start_time = time.perf_counter() tasks = [] for i in range(total_requests): task = asyncio.create_task(send_request(session, i + 1)) tasks.append(task) await asyncio.sleep(interval) await asyncio.gather(*tasks) elapsed = time.perf_counter() - start_time print(f"\n✅ Sent{total_requests} requests in{elapsed:.2f}s " f"(~{total_requests/elapsed:.2f} RPS)") if __name__ == "__main__": asyncio.run(run_load_test()) 

Expected Output
Each group gets one replica

Submit the run replica-groups-test? [y/n]: y NAME BACKEND GPU PRICE STATUS SUBMITTED replica-groups-test - - running 07:31 group=0 replica=0 aws (us-east-2) - $0.0832 running 07:32 group=1 replica=1 aws (us-east-2) - $0.0832 running 07:32 

Later, both groups scale respecting group configs.
group0 scales to 2 replicas,
and group1 scales to 3.

Below is the expected output

NAME BACKEND GPU PRICE STATUS SUBMITTED replica-groups-test - - running 9 mins ago group=0 replica=0 aws (us-east-2) - $0.0832 running 8 mins ago replica=2 aws (us-east-2) - $0.0832 running 3 mins ago group=1 replica=1 aws (us-east-2) - $0.0832 running 8 mins ago replica=3 aws (us-east-2) - $0.0832 running 3 mins ago replica=4 aws (us-east-2) - $0.0832 running 3 mins ago 

Step4: Check whether replica specific commands were executed.
Attach to the desired replica
Eg:
dstack attach -replica 2 replica-groups-test
ssh replica-groups-test-0-2 'cat /tmp/version.txt'
output: Group 1 - Version 0

Step5: Check rolling deployment.
Important:
Rolling deployments are currently affected by a race condition that also impacts the non–replica group implementation and must be addressed separately (issue). However, when each replica group is configured with a single replica, this race condition does not affect rolling deployments.

Testing instructions:

Scale down each replica group to 1 replica.

Restart the load-testing script with RPS = 2.

After all groups have scaled down to a single replica, re-apply the configuration:

Re-apply
dstack apply -f replica-groups-service.yml

Active run replica-groups-test already exists. Detected changes that can be updated in-place: - Configuration properties: - replica_groups Update the run? [y/n]: y NAME BACKEND GPU PRICE STATUS SUBMITTED replica-groups-test - - running 07:51 group=0 replica=0 aws (us-east-2) - $0.0832 terminated 07:51 replica=2 aws (us-east-2) - $0.0832 running 07:53 group=1 replica=1 aws (us-east-2) - $0.0832 terminated 07:51 replica=3 aws (us-east-2) - $0.0832 running 07:53 

@Bihan
Copy link
CollaboratorAuthor

Will be solving merge conflicts as review continues.

@Bihan
Copy link
CollaboratorAuthor

Related PRs

#3205 from @DragonStuff

@peterschmidt85
Copy link
Contributor

peterschmidt85 commented Dec 20, 2025

@Bihan Do we really need replica group names?

@peterschmidt85
Copy link
Contributor

@Bihan Also please check the conflicts with master.

@peterschmidt85
Copy link
Contributor

Cosmetics only: I would rename replica_groups to replicas and also rename replicas under replica_groups to count.

@Bihan
Copy link
CollaboratorAuthor

Cosmetics only: I would rename replica_groups to replicas and also rename replicas under replica_groups to count.

Yes. will rename it.

@Bihan
Copy link
CollaboratorAuthor

@Bihan Do we really need replica group names?

Yes.

Without replica names, we would rely on indices, which are position-dependent. If groups are reordered by users during manual scaling, indices shift, but existing jobs and persisted state (like desired_replica_counts) still reference the old positions. This mismatch prevents reliable identification of which group a job belongs to, leading to incorrect scaling decisions. Replica names are not affected by reordering in the YAML file.

Initial

replica_groups: - replicas: 1 # Index 0 is prefill commands: ["--disaggregation-mode prefill"] - replicas: 2 # Index 1 is decode commands: ["--disaggregation-mode decode"] 

Manual Scaling

replica_groups: - replicas: 3 # Index 0 now becomes decode commands: ["--disaggregation-mode decode"] - replicas: 2 # Index 1 now becomes prefill commands: ["--disaggregation-mode prefill"] 

Instead of relying on replica group's position in the config, another possibility is matching job specs to identify replicas; but this approach fails during rolling deployments because old and new jobs from the same group have different specs.

@peterschmidt85
Copy link
Contributor

As a user I find it unnecessary to give names. I would prefer not to ask names if this is possible technically.
Why not simply treat indices as names. If the user for some reason changes the order, we re-deploy?

@Bihan
Copy link
CollaboratorAuthor

As a user I find it unnecessary to give names. I would prefer not to ask names if this is possible technically. Why not simply treat indices as names. If the user for some reason changes the order, we re-deploy?

If a user changes commands for group 0 and reorders groups at the same time, they expect a rolling deployment for group 0 only. However, the system detects the order change and triggers a full redeployment for all groups. Users may find this implicit behavior annoying because it provisions extra instances for each groups.

@peterschmidt85
Copy link
Contributor

Perhaps we could make these names optional?

@Bihan
Copy link
CollaboratorAuthor

Perhaps we could make these names optional?

Yes, we can make it optional.

Bihan Rana added 2 commits December 23, 2025 06:59
add_replica_groups_model Replica Groups AutoScaling Rolling deployment and UI Replica Groups implementation clean up
@BihanBihanforce-pushed the add_replica_groups_in_dstack_service branch from 86139c5 to 5abbcadCompareDecember 23, 2025 09:29
)


classReplicaGroup(ConfigurationWithCommandsParams, CoreModel):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) I think we could allow to set many more properties per replica group. If the user can set commands, they may also want to set entrypoint, working_dir, image, volumes, repos, etc. And if the user can set resources, they may also want to set instance_types, spot_policy, reservation, etc.

Although it may be a good idea to leave this to a future iteration, because some properties may be non-trivial to support correctly

Comment on lines 588 to 592
new_job_specs=awaitget_job_specs_from_run_spec(
run_spec=run_spec,
run_spec=group_run_spec,
secrets=secrets,
replica_num=replica_num,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Job specs from new_job_specs all have replica_group set to "default", which doesn't match the job specs of existing jobs. This prevents in-place.

To reproduce

Apply this conf:

name: testport: 8000priority: 1replicas: - name: r1count: 1commands: - sleep infinity

Then apply this conf:

name: testport: 8000priority: 2replicas: - name: r1count: 1commands: - sleep infinity

Expected

Changing priority does not cause replica redeployment — the job's deployment_num is updated in-place.

Actual

The job is redeployed.

Copy link
CollaboratorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved.

run_spec=spec,
secrets=secrets,
replica_num=0,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, all job specs returned by /api/project/{project_name}/runs/get_plan always have replica_group="default", even if the configuration specifies other replica groups. This doesn't affect UX right now, but it will be a problem once we try to implement per-group offers display in the run plan as you suggested earlier

Copy link
CollaboratorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved

Comment on lines 535 to 545
jobs=awaitget_jobs_from_run_spec(
run_spec=group_run_spec,
secrets=secrets,
replica_num=global_replica_num,
)

forjobinjobs:
assertreplica_group.nameisnotNone, (
"ReplicaGroup name should be set by validator"
)
job.job_spec.replica_group=replica_group.name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed this pattern in several places in the PR — you create the job using get_jobs_from_run_spec, then mutate its job_spec.replica_group.

Mutating job_spec.replica_group is not an obvious responsibility, so it is very easy to forget doing so. For example, the issues from my two previous comments (1, 2) appear to be caused by not settingreplica_group properly.

Maybe get_jobs_from_run_spec could accept replica_group_name as an argument? The function would then pass it down the call stack, possibly until it reaches JobConfigurator._get_job_spec, where it would be assigned to the job spec when constructing it. This would remove the implicit responsibility of setting job_spec.replica_group after constructing the job

Copy link
CollaboratorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the right way. Thanks for the idea. Will do it accordingly.

Copy link
CollaboratorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented.

Comment on lines 531 to 539
group_run_spec=create_group_run_spec(
base_run_spec=run_spec,
replica_group=replica_group,
)
jobs=awaitget_jobs_from_run_spec(
run_spec=group_run_spec,
secrets=secrets,
replica_num=global_replica_num,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern can be seen in many places in the PR — you are converting the run spec to a group run spec, then passing it to get_jobs_from_run_spec. Similarly to my other comment, this pattern introduces an implicit responsibility for get_jobs_from_run_spec callers, which can be error-prone.

Also, the fact that RunSpec can now be used for both runs and replica groups makes it a bit more difficult to understand the code — now, when you see a RunSpec object, you can't always be sure if it's actually a run spec or a replica group spec.

Maybe we could avoid this conversion from run spec to group run spec throughout the codebase? For example, if get_jobs_from_run_spec will have the replica_group_name parameter and pass it on to the job configurator as I suggested earlier, maybe it could be the job configurator's responsibility to consider both the run spec and the replica group params when building job specs?

Copy link
CollaboratorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To remove the conversion from run spec to group run spec throughout the code base, which can be error-prone. I plan to update JobConfigurator and ServiceJobConfigurator as below.

class JobConfigurator(ABC): TYPE: RunConfigurationType _image_config: Optional[ImageConfig] = None # JobSSHKey should be shared for all jobs in a replica for inter-node communication. _job_ssh_key: Optional[JobSSHKey] = None def __init__( self, run_spec: RunSpec, secrets: Optional[Dict[str, str]] = None, replica_group_name: Optional[str] = None, # <- Pass replica group name to configurator ): self.run_spec = run_spec self.secrets = secrets or{} self.replica_group_name = replica_group_name # <- Store for use in _shell_commands() and _requirements() ... async def _get_job_spec( self, replica_num: int, job_num: int, jobs_per_replica: int, ) -> JobSpec: job_spec = JobSpec( replica_num=replica_num, job_num=job_num, job_name=f"{self.run_spec.run_name}-{job_num}-{replica_num}", jobs_per_replica=jobs_per_replica, replica_group=self.replica_group_name or "default", # <- Set replica_group # ... other fields ... ) return job_spec def _requirements(self, jobs_per_replica: int) -> Requirements: resources = self.run_spec.configuration.resources # Look up replica group and use its resources if type service. if self.run_spec.configuration.type == "service": for group in self.run_spec.configuration.replica_groups: if group.name == self.replica_group_name: resources = group.resources # Use group-specific resources break spot_policy = self._spot_policy() return Requirements( resources=resources, max_price=self.run_spec.merged_profile.max_price, spot=None if spot_policy == SpotPolicy.AUTO else (spot_policy == SpotPolicy.SPOT), reservation=self.run_spec.merged_profile.reservation, multinode=jobs_per_replica > 1, ) class ServiceJobConfigurator(JobConfigurator): TYPE: RunConfigurationType = RunConfigurationType.SERVICE def _shell_commands(self) -> List[str]: assert self.run_spec.configuration.type == "service" # Look up replica group and use its commands. for group in self.run_spec.configuration.replica_groups: if group.name == self.replica_group_name: return group.commands # Use group-specific commands # This shouldn't happen - raise error raise ValueError(f"Replica group '{self.replica_group_name}' not found in service configuration. ") 

@peterschmidt85
Copy link
Contributor

@Bihan the test seem to be broken. Have you seen it?

Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

@Bihan@peterschmidt85@jvstme