- Fix undefined reference to __popcountdi2 by adding __POPCNT__ check
- Use Brian Kernighan's algorithm for better fallback performance
- Improve C compatibility by using NULL instead of nullptr
- Use stdint.h instead of cstdint for better C compatibility
- Prioritize MSVC __popcnt64 over GCC builtin for better reliability
This resolves linking errors in environments where compiler builtins
are not properly linked, particularly affecting test compilation.
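
A minimal sketch of the guard and fallback described above, assuming a GCC-compatible compiler. The function names are hypothetical and are not taken from the patch. The builtin is used only when `__POPCNT__` is defined, so no call to the libgcc routine `__popcountdi2` is emitted otherwise.

```cpp
#include <assert.h>
#include <stdint.h>

// Kernighan's method: each iteration clears the lowest set bit,
// so the loop runs once per set bit rather than once per bit position.
static inline int popcount64_fallback(uint64_t x)
{
    int count = 0;
    while (x)
    {
        x &= x - 1;
        count++;
    }
    return count;
}

// Hypothetical wrapper name, for illustration only.
static inline int popcount64_portable(uint64_t x)
{
#if defined(__GNUC__) && defined(__POPCNT__)
    // POPCNT is enabled at compile time, so the builtin is expanded inline.
    int n = __builtin_popcountll(x);
    assert(n == popcount64_fallback(x)); // debug-only cross-check
    return n;
#else
    return popcount64_fallback(x);
#endif
}
```

Because the fallback is plain C, it also compiles when the translation unit is built without `-mpopcnt` or for a non-x86 target.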
- Use stdint.h consistently for all modes to avoid C++03/C++11 conflicts
- Prevents vector template conflicts between standard library and simplestl
- Resolves 'wrong number of template arguments' errors in aarch64-native CI
- Fix C++03 compatibility by using <stdint.h> instead of <cstdint>
- Fix get_big_cpu_count() to return 0 when no big cores detected
- Resolves multiheadattention test failures caused by thread count changes
- Ensures compatibility with simplestl-simplemath mode
- Add architecture-specific conditional compilation for __popcnt64
- __popcnt64 is only available on x86/x64, not on ARM architectures
- Use fallback implementation for ARM and other non-x86 architectures
- Resolves LNK2019 unresolved external symbol error on Windows ARM builds
- Maintains performance on x86/x64 while ensuring compatibility across all platforms
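
The architecture guard might look like the sketch below. The helper name is hypothetical, and the exact preprocessor condition used in the patch is not shown here; this version assumes the conservative choice of enabling `__popcnt64` only when `_M_X64` is defined, and uses the portable loop everywhere else (including Windows ARM builds).

```cpp
#include <assert.h>
#include <stdint.h>
#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h> // declares __popcnt64 on MSVC x64
#endif

// Hypothetical helper name, for illustration only.
static inline int popcount64_arch_guarded(uint64_t x)
{
    int n = 0;
#if defined(_MSC_VER) && defined(_M_X64)
    // Intrinsic path: only compiled for MSVC targeting x64.
    n = (int)__popcnt64(x);
#else
    // Portable path: ARM, ARM64 and any other target.
    for (; x; x &= x - 1)
        n++;
#endif
    assert(n <= 64); // a 64-bit word has at most 64 set bits
    return n;
}
```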
- Add conditional header includes for uint64_t in all build modes
- Include <stdint.h> in SIMPLESTL mode, <cstdint> in normal mode
- Move standard library headers to conditional compilation blocks
- Fix unsafe bit shift operations that could cause undefined behavior
- Ensure >64 CPU support works correctly in both SIMPLESTL and normal modes
- Tested successfully in NCNN_SIMPLESTL=ON mode
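
To illustrate the shift issue: shifting a 64-bit value by 64 or more (or a 32-bit `int` by 32 or more) is undefined behavior, so a CPU index has to be range-checked before it is turned into a mask bit. The helper below is a hypothetical sketch of such a check, not the code from the patch.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: returns the mask bit for a CPU index,
// or 0 when the index does not fit in a single 64-bit word.
static inline uint64_t cpu_bit(int cpu)
{
    assert(cpu >= 0); // a negative index is a caller bug

    if (cpu < 0 || cpu >= 64)
        return 0; // never shift by a negative or out-of-range amount

    // Widen the constant first so the shift happens in 64 bits, not in int.
    return (uint64_t)1 << cpu;
}
```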
- Fix compilation error for std::pair usage in Windows processor detection
- std::pair requires <utility> header to be explicitly included
- Ensures compatibility across different compilers and environments
- Add #include <cstdint> to cpu.h, cpu.cpp, and platform.h.in
- Implement extended CpuSet class supporting >64 CPUs
- Add fast path for <=64 CPUs and extended path for >64 CPUs
- Include necessary headers for std::max, std::vector, memset, etc.
- Fix original code's missing stdint.h includes for uint64_t usage
- Maintain backward compatibility with platform-specific APIs
Fixes #6142
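
One way to picture the fast path and extended path is the sketch below. The class name and layout are assumptions made for illustration; they do not reproduce ncnn's actual `CpuSet`, which also wraps platform-specific affinity APIs.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <vector>

// Hypothetical class, for illustration only.
class WideCpuSet
{
public:
    WideCpuSet() : fast_mask(0) {}

    void enable(int cpu)
    {
        assert(cpu >= 0);
        if (cpu < 0) return;
        if (cpu < 64)
        {
            fast_mask |= (uint64_t)1 << cpu; // fast path: one 64-bit word
            return;
        }
        // Extended path: CPUs 64 and above live in a growable word array.
        size_t word = (size_t)(cpu - 64) / 64;
        if (word >= extended.size()) extended.resize(word + 1, 0);
        extended[word] |= (uint64_t)1 << ((cpu - 64) % 64);
    }

    bool is_enabled(int cpu) const
    {
        if (cpu < 0) return false;
        if (cpu < 64) return ((fast_mask >> cpu) & 1) != 0;
        size_t word = (size_t)(cpu - 64) / 64;
        if (word >= extended.size()) return false;
        return ((extended[word] >> ((cpu - 64) % 64)) & 1) != 0;
    }

    int num_enabled() const
    {
        int n = count_bits(fast_mask);
        for (size_t i = 0; i < extended.size(); i++)
            n += count_bits(extended[i]);
        return n;
    }

private:
    // Portable bit count: clears the lowest set bit on each iteration.
    static int count_bits(uint64_t x)
    {
        int n = 0;
        for (; x; x &= x - 1) n++;
        return n;
    }

    uint64_t fast_mask;             // CPUs 0..63
    std::vector<uint64_t> extended; // CPUs 64 and above, 64 per word
};
```

Keeping the first 64 CPUs in a plain integer means the common case needs no heap allocation; the vector is only touched on machines with more than 64 logical CPUs.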
- synchronize the latest English content to the Chinese documentation
- correct spelling errors in the English version of glsl-extension
- Fix spelling of 'enable_validation_layer' in src/gpu.cpp
* Use platform-specific APIs for environment variables
The previous patch used `putenv` as a quick fix for Windows compatibility. However, `putenv` is a legacy API and not the recommended choice.
This commit replaces the single `putenv` call with the most appropriate function for each platform:
- On Windows, it now uses the modern and secure `_putenv_s`.
- On Unix-like systems, it uses the standard `setenv`.
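
The per-platform selection could be wrapped as in the sketch below. The wrapper name `set_env_var` is hypothetical; `_putenv_s` and `setenv` are the real CRT and POSIX functions, and both return 0 on success.

```cpp
#include <assert.h>
#include <stdlib.h>

// Hypothetical wrapper name, for illustration only.
// Returns 0 on success and -1 on failure on every platform.
static int set_env_var(const char* name, const char* value)
{
    assert(name != NULL && value != NULL);
#if defined(_WIN32)
    // Windows CRT: takes name and value separately and copies both.
    return _putenv_s(name, value) == 0 ? 0 : -1;
#else
    // POSIX: the third argument (1) overwrites an existing value.
    return setenv(name, value, 1) == 0 ? 0 : -1;
#endif
}
```

Unlike `putenv`, neither function keeps a pointer to the caller's buffer, so the strings passed in can be temporaries.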
---------
Co-authored-by: nihui <shuizhuyuanluo@126.com>
vkimagemat was originally used as mat storage in the hope of improving performance on old Adreno GPUs. In practice it is slower than the CPU in most cases, and it is no longer suitable for the latest Adreno architecture or for large shapes.