Mellanox Technologies
*********************

Congestion Control Manager for Mellanox OFED User Guide

===============================================================================
Table of contents
===============================================================================
 1. Congestion Control Overview
 2. Installation
 3. Running OpenSM with Congestion Control Manager
 4. Example of Congestion Control Manager options file

===============================================================================
1. Congestion Control Overview
===============================================================================
The Congestion Control mechanism controls traffic entry into a network and
attempts to avoid oversubscription of any of the processing or link
capabilities of the intermediate nodes and networks. Additionally, it takes
resource-reducing steps by reducing the rate of sending packets.

Congestion Control Manager enables and configures the Congestion Control
mechanism on fabric nodes (HCAs and switches).

===============================================================================
2. Installation
===============================================================================
Congestion Control Manager is a Subnet Manager plug-in, i.e. it is a shared
library (libccmgr.so) that is dynamically loaded by the Subnet Manager.
Congestion Control Manager is installed as part of the Mellanox OFED
installation.

===============================================================================
3. Running OpenSM with Congestion Control Manager
===============================================================================
Congestion Control (CC) Manager can be enabled/disabled through the SM options
file. To do so, perform the following:

- Create the file.
  Run: 'opensm -c '
- Find the 'event_plugin_name' option in the file and add 'ccmgr':
    # Event plugin name(s)
    event_plugin_name ccmgr
- Run SM with the new options file: 'opensm -F '

CC Manager accepts its own options file, which can be used to fine-tune the
CC mechanism and CC Manager behavior. To do so, perform the following:

- Find the 'event_plugin_options' option in the file and add
  'ccmgr --conf_file ':
    # Options string that would be passed to the plugin(s)
    event_plugin_options ccmgr --conf_file
- Run SM with the new options file: 'opensm -F '
- The default location of the CC Manager options file is
  /etc/opensm/cc_mgr.conf

See an example of a CC Manager options file with all the default values in
section 4.

Important note:
--------------
Once CC is enabled on the fabric nodes, running the SM without the CC Manager
is not enough to turn Congestion Control off completely, because the hardware
continues to operate according to the previous CC configuration. Instead, CC
must be actively disabled: set 'enable' to 'FALSE' in the CC Manager
configuration file, and run OpenSM once with this configuration.

===============================================================================
4. Example of Congestion Control Manager options file
===============================================================================

##################################
# General Options

# Enable/Disable Congestion Control mechanism on fabric nodes
# Values: < TRUE | FALSE >
enable TRUE

# Congestion Control Key (64-bit value)
cc_key 0

# Indicate number of nodes.
# CC Table values are calculated based on this number.
# Values: [0-48K]
# Default: 0 (base CCT calculation on the current subnet size)
num_hosts 0

##################################
# Switch Options

# Indicates how aggressive congestion marking should be.
# Values: [0-0xf] (0 - no packet marking, 0xf - very aggressive)
threshold 0xf

# The value that provides the mean number of packets
# between marking eligible packets with a FECN.
# Values: [0-0xffff]
marking_rate 0xa

# Any packet less than this size [bytes] will not be marked with FECN.
# Values: [0-0x3fc0]
packet_size 0x200

##################################
# CA Options

# Specifies congestion control attribute for this port
# Values: 0 - QP based congestion control,
#         1 - SL/Port based congestion control
port_control 0

# An array of sixteen bits, one for each SL.
# Each bit indicates whether or not the corresponding
# SL entry is to be modified.
ca_control_map 0xffff

# Sets the CC Table Index (CCTI) increase.
# Values: [0-127]
# 0: no CCTI increase - this port will not react upon receiving BECNs
ccti_increase 1

# Sets the congestion log event trigger threshold.
# Values: [1-126]
trigger_threshold 2

# Sets the CC Table Index (CCTI) minimum.
# Values: [0-127]
ccti_min 0

# Sets all the CC Table entries to a specified value (first entry will remain 0).
# Values:
# Up to 128 (CCT size) items in the list.
# If fewer than 128 items are provided, the rest of the CCT will be filled
# with the last item's value.
# 0: CCT calculation based on number of nodes.
cct 0

# Sets for all SL's the given CCTI timer.
# Values: [0-0xffff]
# 0: CCTI timer calculation based on number of nodes.
ccti_timer 0

####################################
# CC MGR Options

# When number of errors exceeds 'max_errors' of send/receive errors or
# timeouts in less than 'error_window' seconds, CC MGR will abort and
# will allow OpenSM to proceed.
# Values for both options: [0-0xffff]
# max_errors = 0: zero tolerance - abort configuration on first error.
# error_window = 0: mechanism disabled - no error checking.
max_errors 5
error_window 5

# CC MGR will collect statistics from all nodes every cc_statistics_cycle [seconds]
# Values: [0-3600]
# 0: no statistics will be collected.
cc_statistics_cycle 20
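
The value ranges listed in the comments above can be checked before an edited
options file is handed to OpenSM. The following Python sketch is not part of
Mellanox OFED; the names parse_options, check_cc_options and RANGES are
hypothetical and exist only for illustration. It assumes that every option is
a "name value" pair on its own line, that '#' starts a comment, that numeric
values are decimal or 0x-prefixed hex (as in the example), and that "48K"
means 48*1024. The 'cct' list, 'cc_key' and 'ca_control_map' are not checked.

```python
# Sketch only: these helpers are hypothetical and not shipped with Mellanox OFED.
# The ranges are copied from the comments in the example options file.
RANGES = {
    "num_hosts": (0, 48 * 1024),      # documented as [0-48K]; 48K read as 48*1024 (assumption)
    "threshold": (0, 0xF),
    "marking_rate": (0, 0xFFFF),
    "packet_size": (0, 0x3FC0),
    "port_control": (0, 1),
    "ccti_increase": (0, 127),
    "trigger_threshold": (1, 126),
    "ccti_min": (0, 127),
    "ccti_timer": (0, 0xFFFF),
    "max_errors": (0, 0xFFFF),
    "error_window": (0, 0xFFFF),
    "cc_statistics_cycle": (0, 3600),
}


def parse_options(text):
    """Return a dict of option name -> raw value string."""
    options = {}
    for raw in text.splitlines():
        # Drop comments (assumed to start with '#') and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        options[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return options


def check_cc_options(options):
    """Return a list of human-readable problems; an empty list means no range violations."""
    problems = []
    for key, (low, high) in RANGES.items():
        if key not in options:
            continue
        try:
            # Base 0 accepts decimal ("20") and 0x-prefixed hex ("0xf").
            value = int(options[key], 0)
        except ValueError:
            problems.append(f"{key}: not an integer: {options[key]!r}")
            continue
        if not low <= value <= high:
            problems.append(f"{key}: {value} outside [{low}, {high}]")
    return problems
```

A typical use would be to read the CC Manager options file (by default
/etc/opensm/cc_mgr.conf), pass its contents through parse_options and then
check_cc_options, and start OpenSM only if the returned list is empty.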