I am generating code for a deep learning network with coder.DeepLearningConfig(TargetLibrary = 'none'). Which code generation configuration settings should I use to optimize the performance of the generated code?
Prashant Kumar answered.
2025-11-20
The recommended settings depend on the target hardware.

For Raspberry Pi hardware:
>> cfg = coder.config('lib');
>> cfg.Hardware = coder.hardware('Raspberry Pi');
>> cfg.CodeReplacementLibrary = "GCC ARM Cortex-A";
>> cfg.EnableOpenMP = true;
>> cfg.DeepLearningConfig = coder.DeepLearningConfig(TargetLibrary = 'none');
>> cfg.DeepLearningConfig.LearnablesCompression = 'bfloat16'; % Requires R2023a or later

For other ARM Cortex-A targets:
>> cfg = coder.config('lib');
>> cfg.HardwareImplementation.ProdHWDeviceType = 'ARM Compatible->ARM Cortex-A';
>> cfg.CodeReplacementLibrary = "GCC ARM Cortex-A";
>> cfg.EnableOpenMP = true;
>> cfg.DeepLearningConfig = coder.DeepLearningConfig(TargetLibrary = 'none');
>> cfg.DeepLearningConfig.LearnablesCompression = 'bfloat16'; % Requires R2023a or later

For ARM Cortex-M targets:
>> cfg = coder.config('lib');
>> cfg.HardwareImplementation.ProdHWDeviceType = 'ARM Compatible->ARM Cortex-M';
>> cfg.CodeReplacementLibrary = 'ARM Cortex-M';
>> cfg.DeepLearningConfig = coder.DeepLearningConfig(TargetLibrary = 'none');
>> cfg.DeepLearningConfig.LearnablesCompression = 'bfloat16'; % Requires R2023a or later

For Intel x86-64 CPUs:
>> cfg = coder.config('lib');
>> cfg.HardwareImplementation.ProdHWDeviceType = 'Intel->x86-64 (Linux 64)'; % If deploying on Linux
>> cfg.InstructionSetExtensions = 'AVX512F'; % or 'AVX2' if 'AVX512F' is not available
>> cfg.EnableOpenMP = true;
>> cfg.DeepLearningConfig = coder.DeepLearningConfig(TargetLibrary = 'none');
>> cfg.DeepLearningConfig.LearnablesCompression = 'bfloat16'; % Requires R2023a or later

For AMD x86-64 CPUs:
>> cfg = coder.config('lib');
>> cfg.HardwareImplementation.ProdHWDeviceType = 'AMD->x86-64 (Linux 64)'; % If deploying on Linux
>> cfg.InstructionSetExtensions = 'AVX512F'; % or 'AVX2' if 'AVX512F' is not available
>> cfg.EnableOpenMP = true;
>> cfg.DeepLearningConfig = coder.DeepLearningConfig(TargetLibrary = 'none');
>> cfg.DeepLearningConfig.LearnablesCompression = 'bfloat16'; % Requires R2023a or later
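
Once one of the cfg objects above is set up, it is passed to codegen together with an entry-point function. The following is only a minimal sketch: the function name myPredict, the file myNet.mat, and the 224-by-224-by-3 single-precision input are hypothetical placeholders, and it assumes the MAT-file holds a SeriesNetwork or DAGNetwork object (a dlnetwork would need its input wrapped in a dlarray).

```matlab
% myPredict.m (hypothetical entry-point function, saved in its own file)
function out = myPredict(in) %#codegen
% Load the network once and reuse it on later calls.
persistent net;
if isempty(net)
    % 'myNet.mat' is a placeholder for your own trained network file.
    net = coder.loadDeepLearningNetwork('myNet.mat');
end
out = predict(net, in);
end
```

```matlab
% Run at the command line after creating cfg as shown above.
% The example input size is an assumption; match it to your network's input layer.
codegen -config cfg myPredict -args {ones(224,224,3,'single')} -report
```

The -report flag produces a code generation report, which can help confirm that the instruction set extension or code replacement library you selected was applied to the generated code.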