Troubleshoot | Diagnosing instance-wide performance

You could encounter different kinds of instance-wide issues on your DSS instance.

For example, you might encounter multiple jobs appearing to slow down when executed on your DSS node. This is most commonly the case when there are multiple Python processes running on the same instance, including Jupyter notebooks.

You can prevent this situation by ensuring that cgroups are set up on your instance.

If cgroups are not yet enabled on your instance, a single Python process can consume as much RAM and CPU as the instance has available. When multiple Python processes run on the same instance, each one can still consume whatever RAM and CPU is available to it, but the resources available at any given moment might be considerably lower, depending on what else is running on the DSS instance.

While a job may still run and complete successfully, this competition for resources between concurrently running Python processes can increase the runtime of each job whenever the jobs are bottlenecked by the resources available on the instance.

In addition, if one user is running an extremely intensive job and the instance does not have cgroups set up, this single job can slow down all other jobs running on the instance at the same time.
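To check whether Python processes are in fact competing for resources, it can help to list the ones currently using the most memory and CPU on the server. The following is a minimal sketch rather than a DSS feature: it assumes a Linux or macOS host where the standard `ps` command is available, and the function names (`parse_ps`, `top_python_processes`, `snapshot`) are hypothetical helpers introduced for illustration.

```python
import subprocess


def parse_ps(output):
    """Parse the text printed by `ps -eo pid,pcpu,pmem,comm` into dicts."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # ignore malformed or empty lines
        pid, cpu, mem, command = parts
        rows.append({
            "pid": int(pid),
            # Some locales print decimals with a comma instead of a dot.
            "cpu": float(cpu.replace(",", ".")),
            "mem": float(mem.replace(",", ".")),
            "command": command.strip(),
        })
    return rows


def top_python_processes(rows, limit=5):
    """Return the Python processes using the most memory, then CPU."""
    python_rows = [r for r in rows if "python" in r["command"].lower()]
    ranked = sorted(python_rows, key=lambda r: (r["mem"], r["cpu"]), reverse=True)
    return ranked[:limit]


def snapshot():
    """Run `ps` once and return the parsed process list (assumes `ps` exists)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,pmem,comm"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)


if __name__ == "__main__":
    for row in top_python_processes(snapshot()):
        print(f"{row['pid']:>8}  cpu={row['cpu']:5.1f}%  mem={row['mem']:5.1f}%  {row['command']}")
```

Run on the DSS server while jobs appear slow, a snapshot like this shows whether a handful of Python processes account for most of the memory or CPU in use. The percentages are whatever `ps` reports at that instant, so a few repeated snapshots give a more reliable picture than a single one.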

You can take several steps to prevent this situation: