This article does not describe every parameter, only those that you usually need to change when configuring a new CloudStack cloud. The settings take effect after each CloudStack management server is restarted.
CloudStack version: 4.11.1
The parameters described in the article can be configured at: Global Settings
Global Settings View
alert.email.addresses
- Proposed value: e-mail
- A list of e-mail addresses to which important system messages are sent.
alert.email.sender
- Proposed value: e-mail
- E-mail sender of error messages, for example, noreply@mycloud.com.
alert.smtp.*
- A family of options for configuring SMTP when sending important system messages.
allow.public.user.templates
- Proposed value: false
- If you do not plan to create an environment in which users can share their templates, the best practice is to set it to false. Otherwise, the template list can fill up with junk, and inattentive users can disclose important data to third parties by marking templates as public.
allow.user.create.projects
- Proposed value: false
- Defines whether regular users are allowed to create projects. Setting it to true may be harmful, because projects track resources separately from accounts. Thus, with improperly designed billing, users can hide machines in projects and use them for free. Another way to limit project creation is to disable it via dynamic roles, which works well if you plan to maintain other user groups that are allowed to create projects.
allow.user.expunge.recover.vm
- Proposed value: true
- By default, when a user deletes a VM, it stays in the destroyed state and is completely deleted (expunged) after expunge.delay. If this flag is enabled, a user deleting a VM can request that it be expunged immediately. Likewise, with this parameter, a user can recover a machine that is in the destroyed state. When working with this parameter, pay attention to expunge.delay, expunge.interval, and expunge.workers, because if large values are specified, users can tie up a lot of limited cluster resources (for example, public IPv4 addresses) with such machines.
allow.user.view.destroyed.vm
- Proposed value: true
- Determines whether a user can see the virtual machines that are not completely deleted (destroyed). If allow.user.expunge.recover.vm is set to true, then it is logical to set this parameter to true as well.
api.allowed.source.cidr.list
- Proposed value for a private cloud: corporate network CIDR
- Proposed value for a public cloud: by default
- Specifies from which IPs it is possible to send requests to the API. As management servers are often deployed behind Nginx, you can leave this option as default and perform fine-tuning at the Nginx level.
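To make the idea of a source CIDR list concrete, here is a minimal Python sketch of the kind of check such a list implies. It is not CloudStack's implementation; the function name and the example networks are hypothetical, and the comma-separated format is an assumption.

```python
import ipaddress


def is_source_allowed(source_ip: str, allowed_cidrs: str) -> bool:
    """Return True if source_ip belongs to any CIDR in a comma-separated list."""
    address = ipaddress.ip_address(source_ip)
    for cidr in allowed_cidrs.split(","):
        # strict=False tolerates entries such as 10.0.0.1/8 with host bits set
        network = ipaddress.ip_network(cidr.strip(), strict=False)
        if address in network:
            return True
    return False


# Hypothetical corporate networks for a private cloud
CORPORATE = "10.0.0.0/8,192.168.10.0/24"

print(is_source_allowed("10.20.30.40", CORPORATE))
print(is_source_allowed("203.0.113.7", CORPORATE))
```

The same logic can be expressed with allow/deny rules at the Nginx level if you prefer to keep the CloudStack setting at its default.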
api.throttling.enabled
- Proposed value: true
- This parameter specifies whether the CloudStack server limits the number of API calls per user. If a cloud is available to an unlimited range of users, it is worth setting it to true. In any case, it is recommended to configure the same kind of limit in Nginx if it shields the management server from external access.
api.throttling.interval
- Proposed value: 1
- The parameter specifies the interval, in seconds, over which requests are counted. In our case it is 1 second, which means we limit the number of requests per second.
api.throttling.max
- Proposed value: 500
- The parameter specifies how many requests can be sent within api.throttling.interval before the API is blocked for the rest of the interval. Setting it to a value below 500 is not recommended, as the UI can become unstable.
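The interplay of api.throttling.interval and api.throttling.max can be illustrated with a simple fixed-window counter. This is only a sketch of the general technique under the assumption that requests are counted per user and per interval; the class and its names are hypothetical and do not reproduce CloudStack's actual code.

```python
class FixedWindowLimiter:
    """Per-user fixed-window request counter (illustrative only)."""

    def __init__(self, interval_seconds: int, max_requests: int) -> None:
        self.interval = interval_seconds
        self.max_requests = max_requests
        self.counters = {}  # user -> (window index, requests seen in that window)

    def allow(self, user: str, now: float) -> bool:
        window = int(now // self.interval)
        current_window, count = self.counters.get(user, (window, 0))
        if current_window != window:
            count = 0  # a new interval has started, so the counter resets
        if count >= self.max_requests:
            return False  # blocked until the interval ends
        self.counters[user] = (window, count + 1)
        return True


# Proposed values from the article: interval = 1 second, max = 500 requests
limiter = FixedWindowLimiter(interval_seconds=1, max_requests=500)
results = [limiter.allow("alice", now=0.1) for _ in range(501)]
print(results.count(True), results[-1])
```

The timestamp is passed in explicitly so that the behaviour is easy to reason about: the 501st request inside the same second is rejected, and the first request of the next second passes again.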
capacity.check.period
- Proposed value: 300000
- Determines how often the available CPU, RAM, and storage resources of the cloud are recalculated. The value is given in milliseconds, so in this case the recalculation occurs once every 300 seconds. If the cloud is small and resources are in short supply, it is recommended to reduce this parameter to avoid the situation when machines are stopped but their resources are not released yet. Basically, this parameter is important only together with the capacity.skipcounting.hours parameter, which allows “returning” the resources of stopped machines to the cloud. Thus, stopped machines (while they are stopped) do not use CPU and RAM resources, and that can be taken into account in billing. However, these machines may fail to start if the resources have already been taken by other machines.
capacity.skipcounting.hours
- Proposed value: 3600
- The name of this parameter is misleading, since the time must actually be specified in seconds, not hours. It determines how long a machine must stay stopped before it is considered not to be using the CPU and RAM of the cloud. This can be useful when the cloud is short of resources while some machines remain stopped. For example, test environments can be started and stopped on demand.
cluster.cpu.allocated.capacity.disablethreshold
This is a default parameter for global configuration variables at the cluster level. All created clusters will inherit this parameter.
cluster.cpu.allocated.capacity.notificationthreshold
This is a default parameter for global configuration variables at the cluster level. All created clusters will inherit this parameter.
cluster.localStorage.capacity.notificationthreshold
This is a default parameter for global configuration variables at the cluster level. All created clusters will inherit this parameter.
cluster.memory.allocated.capacity.disablethreshold
This is a default parameter for global configuration variables at the cluster level. All created clusters will inherit this parameter.
cluster.memory.allocated.capacity.notificationthreshold
This is a default parameter for global configuration variables at the cluster level. All created clusters will inherit this parameter.
cluster.storage.allocated.capacity.notificationthreshold
This is a default parameter for global configuration variables at the cluster level. All created clusters will inherit this parameter.
cluster.storage.capacity.notificationthreshold
This is a default parameter for global configuration variables at the cluster level. All created clusters will inherit this parameter.
cluster.threshold.enabled
- Proposed value: true
- The parameter determines if the warning and exception levels for the node selection algorithm will be used in clusters.
consoleproxy.service.offering
- Proposed value: uuid
- The parameter specifies which service offering to use for a CPVM. You must specify the UUID of the service offering. For small or private clouds, configuring this parameter does not make sense, because by default the resources of the service offering for the CPVM that CloudStack uses are sufficient. However, if you want to provide more CPVM cores or place it on specific nodes, then this parameter allows you to do this.
consoleproxy.sslEnabled
- Proposed value: true
- The parameter specifies whether a CPVM will use SSL for the virtual console. If you plan to host the management server behind HTTPS, you must set it to true. Otherwise, when you open the console of a virtual machine, you will always receive a security error. You can upload the certificate chain in the Infrastructure section by clicking the SSL Certificate button. If this is not done, access to the VM consoles with consoleproxy.sslEnabled enabled will not work correctly. The SSL certificate must be a wildcard certificate.
consoleproxy.url.domain
- Proposed value: *.domain.com
- The CPVM is almost a regular VM that runs in a cluster. The management server knows which guest IP the CPVM uses and, when a virtual machine console is accessed, provides the FQDN in the form of 1-2-3-4.domain.com, where 1-2-3-4 is the guest IPv4 address of the CPVM. Thus, you need to provide support for the wildcard domain *.domain.com for correct operation of the CPVM.
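The naming scheme described above can be sketched in a few lines of Python. The helper name is hypothetical; it only mirrors the 1-2-3-4.domain.com pattern from the article and is handy for checking that your wildcard DNS record and certificate cover the names that will be generated.

```python
import ipaddress


def console_proxy_fqdn(cpvm_ip: str, url_domain: str) -> str:
    """Build a name such as 1-2-3-4.domain.com from an IPv4 address and *.domain.com."""
    ip = ipaddress.IPv4Address(cpvm_ip)  # raises ValueError on a malformed address
    suffix = url_domain.lstrip("*").lstrip(".")  # "*.domain.com" -> "domain.com"
    return str(ip).replace(".", "-") + "." + suffix


print(console_proxy_fqdn("1.2.3.4", "*.domain.com"))
```

The same pattern applies to the SSVM names covered by secstorage.ssl.cert.domain.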
custom.diskoffering.size.max
- If you plan to use disk offerings of the Custom Disk Size type, then this parameter must be set to a reasonable value that the disks cannot exceed, e.g. 1024 (1 TB).
custom.diskoffering.size.min
- If you plan to use disk offerings of the Custom Disk Size type, then this parameter must be set to a reasonable value that the disks cannot be smaller than, e.g. 10 (10 GB).
expunge.delay
- Proposed value: 300
- The parameter determines after what time the machines in the “destroyed” state will be expunged.
expunge.interval
- Proposed value: 60
- The parameter specifies how often the expunging procedure for virtual machines is executed to finally remove the VMs marked as “destroyed”.
expunge.workers
- Proposed value: 1
- The parameter determines how many threads will delete the machines.
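To see how expunge.delay and expunge.interval combine, consider a simplified model: a destroyed VM becomes eligible for expunging after expunge.delay seconds, and it is actually removed by the first cleanup run that happens after that moment. The sketch below assumes cleanup runs start at a known time and take no time themselves; the function name and the first_run parameter are hypothetical.

```python
def expunge_time(destroyed_at: int, expunge_delay: int,
                 expunge_interval: int, first_run: int = 0) -> int:
    """Return the time (in seconds) of the cleanup run that expunges the VM."""
    eligible_at = destroyed_at + expunge_delay
    # Ceiling division: number of intervals until the first run at or after eligible_at
    runs_needed = -(-(eligible_at - first_run) // expunge_interval)
    return first_run + max(runs_needed, 0) * expunge_interval


# Proposed values from the article: delay = 300 s, interval = 60 s
print(expunge_time(destroyed_at=0, expunge_delay=300, expunge_interval=60))
print(expunge_time(destroyed_at=10, expunge_delay=300, expunge_interval=60))
```

Under these assumptions a VM is expunged between expunge.delay and expunge.delay + expunge.interval seconds after it was destroyed, which is the window during which it still holds resources such as public IP addresses.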
host
The parameter specifies the IP address of the CloudStack management server that agents connect to. This parameter may need changing in several cases:
- when using an address that is balanced between management servers, for example, using Keepalived;
- if the CloudStack server identified its address incorrectly after the installation and the agents connect to the wrong addresses.
host.capacityType.to.order.clusters
- Proposed value: CPU
- This parameter is used by FirstFitAllocator to sort nodes by the smallest amount of used resources. You can specify CPU or RAM. If CPU is specified, virtual machines will be created on hosts with the minimum allocated CPU (allocated, not utilized). If RAM is specified, virtual machines will be created on nodes with the minimum allocated memory.
instance.name
- Proposed value: vm
- Defines the prefix of the name of a newly created VM if a name is not set. As a result, the machine name will be vm-<uuid>.
kvm.snapshot.enabled
- Proposed value: true
- Enables the support for full snapshots of a virtual machine for the KVM hypervisor. Full snapshots for KVM are quite a new option. If you are striving for greater stability, you can disable this feature.
ldap.*
The parameter group defines the LDAP settings when using LDAP to authenticate users.
mac.identifier
- Proposed value: 0
- The value of the second octet for the generated MAC addresses of virtual machines. See the createSequenceBasedMacAddress method. The first octet of the MAC is always 0x1e (see private final static long prefix = 0x1e; in the same file). Thus, by default, all MAC addresses of machines are generated in the space 1e:00:B3:B4:B5:B6, where B3 is a random number and B4-B6 are generated from the IP address identifier in the CloudStack MySQL database. This option is useful if the IP/MAC spaces of several cloud systems are mixed and collisions are possible.
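The MAC layout described for mac.identifier (1e, then the identifier, then a random octet, then three octets derived from the IP address identifier) can be sketched as follows. This is an illustration built from the description in the article, not a copy of createSequenceBasedMacAddress; the function name and the way the random octet is drawn are assumptions.

```python
import random

MAC_PREFIX = 0x1E  # first octet, fixed according to the article


def sequence_based_mac(ip_id: int, mac_identifier: int, rng: random.Random) -> str:
    """Compose 1e:<identifier>:<random>:<three octets taken from ip_id>."""
    value = (
        (MAC_PREFIX << 40)
        | ((mac_identifier & 0xFF) << 32)  # second octet: mac.identifier
        | (rng.randrange(256) << 24)       # third octet: random
        | (ip_id & 0xFFFFFF)               # last three octets: database identifier
    )
    return ":".join(f"{octet:02x}" for octet in value.to_bytes(6, "big"))


print(sequence_based_mac(ip_id=0x0A0B0C, mac_identifier=0, rng=random.Random(1)))
print(sequence_based_mac(ip_id=0x0A0B0C, mac_identifier=5, rng=random.Random(1)))
```

Two clouds that share a layer 2 segment and use different mac.identifier values always differ in the second octet, which is what rules out collisions between them.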
management.network.cidr
This parameter allows specifying the network in which the management servers are located. This helps CloudStack correctly define the network for connecting agents to servers, especially in the case of multiple interfaces on management servers.
max.account.*
The resource limits that each newly created account inherits.
max.domain.*
The resource limits that each newly created domain inherits.
max.project.*
The resource limits that each newly created project inherits.
max.template.iso.size
- Proposed value: 50
- Specifies the maximum size, in GB, of a template or ISO that can be loaded into the system, either via a download URL or from a local server.
ping.interval
- Proposed value: 60
- The parameter specifies how often the management server communicates with an agent to check its status. If the parameter value is large, the management server will identify alarm states with a significant delay. Used together with ping.timeout.
ping.timeout
- Proposed value: 2.5
- The parameter specifies the multiplier applied to ping.interval to calculate the time period after which the agent is considered unavailable. For example, 60 x 2.5 = 150 seconds, so the agent will be considered unavailable after two missed check intervals. If you specify, for example, 3.5, then the agent will be considered unavailable after three missed check intervals.
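The arithmetic behind ping.interval and ping.timeout is simple enough to check in code. The helper below is a hypothetical illustration of the calculation described in the article, not part of CloudStack.

```python
import math


def agent_timeout(ping_interval: int, ping_timeout: float):
    """Return (seconds until the agent is considered unavailable, full intervals missed)."""
    timeout_seconds = ping_interval * ping_timeout
    missed_checks = math.floor(ping_timeout)
    return timeout_seconds, missed_checks


print(agent_timeout(60, 2.5))
print(agent_timeout(60, 3.5))
```

A fractional multiplier such as 2.5 leaves half an interval of slack after the second missed check, so a ping that arrives slightly late does not immediately mark the agent as unavailable.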
resourcecount.check.interval
- Proposed value: 300
- The parameter specifies how often the allocated resources are recursively recalculated in the domain hierarchy. The fact is that CloudStack does not perform a multi-level recalculation up the hierarchy at each change in accounts or subdomains within the domain. The periodic function performs this procedure once every resourcecount.check.interval seconds.
router.aggregation.command.each.timeout
- Proposed value: 600
- An important parameter that determines how long the management server waits for a response from the VR when sending a request to it. We recommend setting this parameter to a higher value. In this case, it is 10 minutes. It is significant because when the VR starts up slowly (which can happen due to various factors, for example, the load), the management server will register a timeout and try to restart the VR. As a result, it can be unavailable for a long period of time and may never restart.
router.cpu.mhz
The parameter specifies what CPU frequency should be set when VR starts. In the case of heavily-loaded VRs with many running services, it may be reasonable to increase this parameter. However, do not put it higher than the frequency of CPU of the nodes allows, otherwise, CloudStack will not be able to find an appropriate node and VR will not start.
secstorage.encrypt.copy
- Proposed value: true
- Determines whether an SSVM will use HTTPS. Used together with secstorage.ssl.cert.domain.
secstorage.ssl.cert.domain
- Proposed value: *.yourdomain.com
- An important parameter that determines what FQDN CloudStack will generate for an SSVM. This parameter is used to implement the downloading of volumes and templates from the CloudStack local server. The option will not work without correct configuration. CloudStack generates the FQDN in the form 1-2-3-4.yourdomain.com, so you must support the wildcard domain *.yourdomain.com and deploy a wildcard certificate in CloudStack (see also consoleproxy.sslEnabled, consoleproxy.url.domain).
secstorage.service.offering
- Proposed value: uuid
- The parameter specifies which service offering to use for the SSVM machine. You must specify the UUID of the service offering. For small or private clouds, configuring this parameter does not make sense, because by default the service offering resources, used by CloudStack for SSVMs, are sufficient. Though, if you want to provide more CPUs to an SSVM or place it on specific nodes, this parameter allows doing it.
secstorage.vm.mtu.size
- Proposed value: 1500, 9000
- If the Secondary Storage communication network is configured with jumbo frames support and the secondary storage also supports jumbo frames, here you need to specify an MTU equal to the value set for jumbo frames.
snapshot.max.*
The section defines how many hourly, daily, weekly, and monthly scheduled disk snapshots cloud users can create. We recommend that you disable hourly snapshots completely, because there are always users who will overdo it. That may disrupt the work of their VMs, as every hour the VMs will stay “frozen” for some seconds.
We recommend that you consider disabling Scheduled Snapshots via dynamic roles, as they can create a false illusion of consistent backups.
storage.max.volume.size
- Proposed value: 2000
- The parameter specifies the maximum volume size, in GB, supported by the cloud. In this example, it is specified as 2 TB. As a virtual machine allows attaching multiple volumes, there is usually no need to use a single large volume. Primarily, this parameter should be calculated based on the storage capabilities and the capacity of the storage data transfer network. For example, copying a snapshot of a 2 TB volume to the Secondary Storage via a 10 Gbit/s link will take no less than about 35 minutes in practice, even in the absence of IO limits, and will put a high IO load on the disk system for that whole time.
There are always mutually exclusive settings that need to be considered.
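The transfer-time estimate for storage.max.volume.size can be reproduced with a short calculation. The sketch below uses decimal units (1 GB = 10^9 bytes); the efficiency parameter is a hypothetical factor standing in for protocol and storage overhead, and the 0.75 value is only an assumption chosen for illustration.

```python
def transfer_minutes(volume_gb: float, link_gbit_per_s: float,
                     efficiency: float = 1.0) -> float:
    """Estimate how long it takes to copy volume_gb over a link of the given speed."""
    bits = volume_gb * 1e9 * 8
    seconds = bits / (link_gbit_per_s * 1e9 * efficiency)
    return seconds / 60


# 2 TB over 10 Gbit/s at full line rate, then with an assumed 75% effective throughput
print(round(transfer_minutes(2000, 10), 1))
print(round(transfer_minutes(2000, 10, efficiency=0.75), 1))
```

At full line rate the copy takes a little under 27 minutes, and with the assumed 75% effective throughput it takes roughly 36 minutes, which is in line with the estimate given for a 2 TB volume.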
storage.max.volume.upload.size
- Proposed value: 500
- The parameter specifies the maximum size, in GB, of a volume that can be imported into the system. The parameter is quite safe, but the specific value is determined by the capabilities of your network, SSVM, and Secondary Storage.
storage.overprovisioning.factor
- Proposed value: 4
- This parameter controls the degree of over-provisioning for the Primary Storage. We propose to set this parameter to a rather large value if you plan to implement the “Pay as you go” model of storage growth and use tools for available disk space monitoring. If the storage is added to the cloud fully configured, it is better to set it to “1”.
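The effect of storage.overprovisioning.factor can be shown with a small calculation: the capacity that may be allocated to volumes is the physical capacity multiplied by the factor. The function names and the example sizes below are hypothetical, and the check is a simplified model rather than CloudStack's allocator.

```python
def allocatable_gb(physical_gb: float, overprovisioning_factor: float) -> float:
    """Capacity that can be promised to volumes on a primary storage pool."""
    return physical_gb * overprovisioning_factor


def can_allocate(physical_gb: float, factor: float,
                 already_allocated_gb: float, requested_gb: float) -> bool:
    """Check whether a new volume still fits under the over-provisioned limit."""
    return already_allocated_gb + requested_gb <= allocatable_gb(physical_gb, factor)


# A hypothetical 10 TB pool with the proposed factor of 4
print(allocatable_gb(10000, 4))
print(can_allocate(10000, 4, already_allocated_gb=39000, requested_gb=1000))
print(can_allocate(10000, 1, already_allocated_gb=9500, requested_gb=1000))
```

With a factor of 4, the sum of volume sizes may reach four times the physical capacity, which is why monitoring of the actually used disk space becomes mandatory.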
vm.allocation.algorithm
- Proposed value: firstfitleastconsumed
- This parameter is used by the FirstFitDeploymentPlanner to choose the algorithm for selecting candidate nodes for deploying a new virtual machine. The “firstfitleastconsumed” value corresponds to the algorithm that selects the nodes with the smallest amount of used resources. This strategy works together with the host.capacityType.to.order.clusters parameter.
In general, understanding planners is rather challenging. We recommend following the source code to understand the basic operation principles in detail.
The following set of parameters:
vm.deployment.planner=FirstFitPlanner
vm.allocation.algorithm=firstfitleastconsumed
host.capacityType.to.order.clusters=RAM
will correspond to the algorithm for deployment of a new VM on the nodes with minimum used RAM.
And the following set of parameters:
vm.deployment.planner=FirstFitPlanner
vm.allocation.algorithm=firstfitleastconsumed
host.capacityType.to.order.clusters=CPU
will correspond to the algorithm for deployment of a new VM on the nodes with minimum used CPU.
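The effect of host.capacityType.to.order.clusters on node selection can be pictured as a sort of candidate hosts by their allocated capacity. The sketch below is a simplified illustration and not CloudStack's planner code; the Host class, its fields, and the use of the allocated fraction as the sort key are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cpu_allocated_mhz: int
    cpu_total_mhz: int
    ram_allocated_mb: int
    ram_total_mb: int


def order_hosts(hosts, capacity_type: str):
    """Order hosts so that the least allocated one (by CPU or RAM) comes first."""
    if capacity_type == "CPU":
        return sorted(hosts, key=lambda h: h.cpu_allocated_mhz / h.cpu_total_mhz)
    if capacity_type == "RAM":
        return sorted(hosts, key=lambda h: h.ram_allocated_mb / h.ram_total_mb)
    raise ValueError("capacity_type must be CPU or RAM")


hosts = [
    Host("node-a", 8000, 16000, 16000, 64000),  # 50% CPU, 25% RAM allocated
    Host("node-b", 2000, 16000, 48000, 64000),  # 12.5% CPU, 75% RAM allocated
]
print([h.name for h in order_hosts(hosts, "CPU")])
print([h.name for h in order_hosts(hosts, "RAM")])
```

The same two hosts come out in opposite order depending on the capacity type, which is why the choice between CPU and RAM should follow the resource that runs out first in your cloud.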
vm.deployment.planner
- Proposed value: FirstFitPlanner
- Defines a planner for deploying new virtual machines on the nodes. It deploys machines on the first suitable node, according to the placement algorithm (see vm.allocation.algorithm above). There are other planners that you can consider, for example, UserDispersingPlanner and UserConcentratedPodPlanner. For random deployment of machines, FirstFitPlanner with the corresponding algorithm is used.
vm.disk.throttling.*
The parameter group that defines default IO and disk throughput limits. We recommend using it as a tool for preventing configuration errors related to the human factor.
vm.network.throttling.rate
The parameter specifies a default network throughput limit. We recommend using it as a tool for preventing configuration errors related to the human factor.
vmsnapshot.expire.interval
- Proposed value: -1
- Specifies the time after which a full snapshot of a machine becomes invalid and is automatically deleted. The default value means ‘never’.
vmsnapshot.max
- Proposed value: 1
- The parameter specifies the maximum number of snapshots a user can create for a virtual machine. In this case, the value is set to 1, that is, to create a new snapshot of the machine, a user must delete the previous one.
Hypervisor Capabilities View
KVM:default
- Max Guest Limit: 150
- Modern nodes allow starting many virtual machines. The default value is conservative (50), which means that even if there are available resources on the node, it cannot host more than 50 machines. When using nodes with 128-256 GB of RAM, it is reasonable to specify Max Guest Limit = 150-200.
Since each cloud has an individual design and properties, other parameters may be significant for you. However, we provide the information on those parameters that we deal with in our projects. If you would like to share information on any other important parameters, please, contact us at info (at) bitworks.software and we will add them to the article.