From 819a396df304394c92881bfc31e53bbc3c2d75cb Mon Sep 17 00:00:00 2001
From: lvzhangcheng
Date: Sat, 19 Sep 2020 16:09:41 +0800
Subject: [PATCH] add coding guild en version

---
 security/coding_guild_cpp_en.md    | 837 +++++++++++++++++++++++++++++
 security/coding_guild_cpp_zh_cn.md |   4 +-
 security/coding_guild_python_en.md | 340 ++++++++++++
 3 files changed, 1179 insertions(+), 2 deletions(-)
 create mode 100644 security/coding_guild_cpp_en.md
 create mode 100644 security/coding_guild_python_en.md

diff --git a/security/coding_guild_cpp_en.md b/security/coding_guild_cpp_en.md
new file mode 100644
index 0000000..cffe132
--- /dev/null
+++ b/security/coding_guild_cpp_en.md
@@ -0,0 +1,837 @@
+
+- [Note](#note)
+- [Scope](#scope)
+    - [Code Style](#1-code-style)
+        - [Naming](#11-naming)
+        - [Format](#12-format)
+        - [Comment](#13-comment)
+    - [General Coding](#2-general-coding)
+        - [Code Design](#21-code-design)
+        - [Header File and Preprocessing](#22-header-file-and-preprocessing)
+        - [Data Type](#23-data-type)
+        - [Constant](#24-constant)
+        - [Variable](#25-variable)
+        - [Expression](#26-expression)
+        - [Conversion](#27-conversion)
+        - [Control Statement](#28-control-statement)
+        - [Declaration and Initialization](#29-declaration-and-initialization)
+        - [Pointer and Array](#210-pointer-and-array)
+        - [String](#211-string)
+        - [Assert](#212-assert)
+        - [Class and Object](#213-class-and-object)
+        - [Function Design](#214-function-design)
+        - [Function Usage](#215-function-usage)
+        - [Memory](#216-memory)
+        - [File](#217-file)
+        - [Secure Function](#218-secure-function)
+
+## Note
+
+This document is based on the [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html), the Huawei C++ Coding Style Guide, the Huawei secure coding standards, and industry consensus. To participate in the MindSpore community, follow this style guide first, and then the Google C++ Style Guide.
+If you disagree with a rule, submit an issue that explains your reasons. The change takes effect after the MindSpore community operation team has reviewed and accepted the issue and revised the guide.
+
+## Scope
+
+MindSpore open source community
+
+------------------
+
+### 1. Code Style
+
+#### 1.1 Naming
+
+##### Rule 1.1.1 File Naming
+
+C++ file names consist of lowercase letters and underscores (_). Source files use the `.cc` extension and header files use `.h`. Unit test file names end with `_test.cc`.
+
+> a_b_c.h
+> a_b_c.cc
+> a_b_c_test.cc
+
+##### Rule 1.1.2 Use lowercase letters and underscores (_) to name local variables and parameters.
+
+```cpp
+void FooBar(int func_param) {
+  int local_var;
+}
+```
+
+##### Rule 1.1.3 Use lowercase letters and underscores (_) to name member variables, with an underscore (_) as the suffix.
+
+```cpp
+class FooBar {
+ public:
+  int member_var_;
+};
+```
+
+##### Rule 1.1.4 Use uppercase letters and underscores (_) in macro names.
+
+```cpp
+#define MS_LOG(...)
+```
+
+##### Rule 1.1.5 Name constants and enumerators in CamelCase with the prefix "k".
+
+```cpp
+const int kDaysInAWeek = 7;
+
+enum UrlTableErrors {
+  kOk = 0,
+  kErrorOutOfMemory,
+  kErrorMalformedInput,
+};
+```
+
+#### 1.2 Format
+
+##### Recommendation 1.2.1 Each line contains a maximum of 120 characters.
+
+If a line would exceed 120 characters, break it at a sensible point.
+
+##### Rule 1.2.2 Use spaces to indent, two at a time.
+
+##### Rule 1.2.3 When declaring pointer or reference variables or parameters, attach `&` and `*` to the variable name and place a space on the other side.
+
+```cpp
+char *c;
+const std::string &str;
+```
+
+##### Rule 1.2.4 Always enclose the body of an if statement in braces.
+
+```cpp
+// Braces are required even if the if branch is a single line.
+if (cond) {
+  single line code;
+}
+```
+
+##### Rule 1.2.5 Use braces for loop statements such as for and while, even if the loop body is empty or contains only one statement.
+
+##### Rule 1.2.6 Keep the line break style of expressions consistent, and place operators at the end of a line.
+
+```cpp
+int a = a_very_long_expression +
+        a_very_very_long_expression +
+        a_very_very_very_long_expression;
+```
+
+##### Rule 1.2.7 Each variable definition or assignment statement occupies one line.
+
+```cpp
+a = 1;
+b = 2;
+c = 3;
+```
+
+#### 1.3 Comment
+
+##### Rule 1.3.1 File header comments contain copyright statements.
+
+All .h and .cc files must contain the following copyright statement:
+```cpp
+
+/**
+ * Copyright 2019 Huawei Technologies Co., Ltd
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+```
+
+> Notes:
+> Files created in 2020 should contain `Copyright 2020 Huawei Technologies Co., Ltd`.
+> Files created in 2019 and modified in 2020 should contain `Copyright 2019-2020 Huawei Technologies Co., Ltd`.
+
+##### Rule 1.3.2 A comment is placed above or to the right of the code. There must be a space between the comment character and the comment content, and at least one space between the code and its comment on the right. Use `//`, not `/**/`.
+
+```cpp
+// this is a multi-
+// line comment
+int foo;  // this is a single-line comment
+```
+
+##### Rule 1.3.3 Do not use comments such as TODO, TBD, and FIXME in code.
You are advised to submit an issue for tracking.
+
+##### Recommendation 1.3.4 Function header comments with no content are forbidden.
+
+Not every function needs a header comment. Let the function name document the function wherever possible, and write a header comment only when it is needed, that is, when readers need information that the function prototype cannot express.
+Do not write useless or redundant function header comments. The content of a header comment is optional and includes, but is not limited to, the function description, return values, performance constraints, usage, memory conventions, algorithm implementation, and reentrancy requirements.
+
+
+### 2. General Coding
+
+#### 2.1 Code Design
+
+##### Rule 2.1.1 Check the validity of all external data, including but not limited to function input parameters, command-line input, files, environment variables, and user data.
+
+##### Rule 2.1.2 When returning function execution results, prefer return values and avoid output parameters.
+
+```cpp
+FooBar *Func(const std::string &in);
+```
+
+##### Rule 2.1.3 Delete invalid, redundant, or never-executed code.
+
+Most modern compilers can warn about invalid or never-executed code in many cases. Identify the cause of each such warning and clear it.
+Identify and delete invalid statements or expressions from the code.
+
+##### Rule 2.1.4 Follow these additional rules when using the C++ exception mechanism.
+
+###### Rule 2.1.4.1 Specify the types of exceptions to be caught. Do not catch all exceptions.
+
+```cpp
+// Incorrect
+try {
+  // do something;
+} catch (...) {
+  // do something;
+}
+// Correct
+try {
+  // do something;
+} catch (const std::bad_alloc &e) {
+  // do something;
+}
+```
+
+#### 2.2 Header File and Preprocessing
+
+##### Rule 2.2.1 Use the new standard C++ header file.
+
+```cpp
+// Correct
+#include <cstdlib>
+// Incorrect
+#include <stdlib.h>
+```
+
+##### Rule 2.2.2 Header file cyclic dependency is forbidden.
+
+An example of cyclic dependency (also known as circular dependency): a.h includes b.h, b.h includes c.h, and c.h includes a.h. If any of these header files is modified, all code that includes a.h, b.h, or c.h must be recompiled.
+A cyclic dependency between header files points to an unreasonable architecture design and can be avoided by improving the design.
+
+##### Rule 2.2.3 Do not include unnecessary header files.
+
+##### Rule 2.2.4 Do not reference external function interfaces or variables through extern declarations.
+
+##### Rule 2.2.5 Do not include header files in extern "C".
+
+##### Rule 2.2.6 Do not use "using" to import a namespace in a header file or before #include statements.
+
+
+#### 2.3 Data Type
+
+##### Recommendation 2.3.1 Do not abuse typedef or #define to alias basic types.
+
+##### Rule 2.3.2 Use "using" instead of typedef to define type aliases, to avoid shotgun modifications caused by type changes.
+
+```cpp
+// Correct
+using FooBarPtr = std::shared_ptr<FooBar>;
+// Incorrect
+typedef std::shared_ptr<FooBar> FooBarPtr;
+```
+
+#### 2.4 Constant
+
+##### Rule 2.4.1 Do not use macros to replace constants.
+
+##### Rule 2.4.2 Do not use magic numbers or magic strings.
+
+##### Recommendation 2.4.3 Ensure that a constant has only one responsibility.
+
+#### 2.5 Variable
+
+##### Recommendation 2.5.1 Use namespaces to manage global constants. If the global constants are closely tied to a class, you can use static member constants to manage them.
+
+```cpp
+namespace foo {
+int kGlobalVar;
+
+class Bar {
+ private:
+  static int static_member_var_;
+};
+}
+```
+
+##### Rule 2.5.2 Do not use global variables. Use the singleton pattern cautiously to avoid abuse.
+
+##### Rule 2.5.3 A variable that is incremented or decremented in an expression must not be referenced again in the same expression.
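
As a hedged illustration of Rule 2.5.3 (a sketch added for clarity, not part of the original guide), the function below keeps the increment in its own statement. `CopyValues` is a hypothetical name. The commented-out line shows the pattern the rule forbids: before C++17, modifying `i` and reading it again in one expression like this is undefined behavior, and even where the evaluation order is defined the expression is easy to misread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration: returns a copy of src.
std::vector<int> CopyValues(const std::vector<int> &src) {
  std::vector<int> dst(src.size());
  std::size_t i = 0;
  while (i < src.size()) {
    // Noncompliant alternative: dst[i] = src[i++];
    // Here i would be incremented and referenced again in the same expression.
    dst[i] = src[i];
    ++i;  // Compliant: the increment stands alone in its own statement.
  }
  return dst;
}
```

Calling `CopyValues` on any vector should return an element-by-element copy, which makes the helper easy to check in a unit test.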
+
+##### Rule 2.5.4 After a resource is released, immediately assign a new value to the pointer variable that points to the resource handle or descriptor, or set it to NULL.
+
+##### Rule 2.5.5 Do not use uninitialized variables.
+
+#### 2.6 Expression
+
+##### Recommendation 2.6.1 In comparisons, put the operand that tends to change on the left and the operand that tends to remain unchanged on the right.
+
+```cpp
+// Correct
+if (ret != SUCCESS) {
+  ...
+}
+
+// Incorrect
+if (SUCCESS != ret) {
+  ...
+}
+```
+
+##### Rule 2.6.2 Use parentheses to make operator precedence explicit and avoid simple mistakes.
+
+```cpp
+// Correct
+if (cond1 || (cond2 && cond3)) {
+  ...
+}
+
+// Incorrect
+if (cond1 || cond2 && cond3) {
+  ...
+}
+```
+
+#### 2.7 Conversion
+
+##### Rule 2.7.1 Use C++ casts instead of C-style casts. Do not use const_cast and reinterpret_cast.
+
+
+#### 2.8 Control Statement
+
+##### Rule 2.8.1 A switch statement must have a default branch.
+
+
+#### 2.9 Declaration and Initialization
+
+##### Rule 2.9.1 Do not use `memcpy_s` or `memset_s` to initialize non-POD objects.
+
+
+#### 2.10 Pointer and Array
+
+##### Rule 2.10.1 Do not hold on to the pointer returned by c_str() of std::string.
+
+```cpp
+// Incorrect
+const char *a = std::to_string(12345).c_str();
+```
+
+##### Rule 2.10.2 Prefer `unique_ptr` to `shared_ptr`.
+
+##### Rule 2.10.3 Create `shared_ptr` by using `std::make_shared` instead of `new`.
+
+```cpp
+// Correct
+std::shared_ptr<FooBar> foo = std::make_shared<FooBar>();
+// Incorrect
+std::shared_ptr<FooBar> foo(new FooBar());
+```
+
+##### Rule 2.10.4 Use a smart pointer to manage objects. Do not use new or delete.
+
+##### Rule 2.10.5 Do not use auto_ptr.
+
+##### Rule 2.10.6 For formal parameters of pointer and reference types, if the parameters do not need to be modified, use const.
+
+##### Rule 2.10.7 When an array is passed as a function parameter, also pass its length as a parameter.
+
+```cpp
+int ParseMsg(BYTE *msg, size_t msgLen) {
+  ...
+}
+```
+
+#### 2.11 String
+
+##### Rule 2.11.1 When saving a string, ensure that it has '\0' at the end.
+
+#### 2.12 Assert
+
+##### Rule 2.12.1 Do not use assertions to check for errors that may occur at run time. Handle possible runtime errors with error-handling code.
+
+#### 2.13 Class and Object
+
+##### Rule 2.13.1 Release a single object with delete and an array object with delete [].
+
+```cpp
+const int kSize = 5;
+int *number_array = new int[kSize];
+int *number = new int();
+...
+delete[] number_array;
+number_array = nullptr;
+delete number;
+number = nullptr;
+```
+
+##### Rule 2.13.2 Do not apply std::move to const objects.
+
+##### Rule 2.13.3 Declare virtual functions strictly with virtual, override, or final.
+
+```cpp
+class Base {
+ public:
+  virtual void Func();
+};
+
+class Derived : public Base {
+ public:
+  void Func() override;
+};
+
+class FinalDerived : public Derived {
+ public:
+  void Func() final;
+};
+```
+
+#### 2.14 Function Design
+
+##### Rule 2.14.1 Use the RAII feature to help track dynamic allocation.
+
+```cpp
+// Correct
+{
+  std::lock_guard<std::mutex> lock(mutex_);
+  ...
+}
+```
+
+##### Rule 2.14.2 Avoid capturing by reference in lambdas that are used outside the local scope.
+
+```cpp
+{
+  int local_var = 1;
+  // Incorrect: local_var may already be destroyed when the thread pool runs func.
+  auto func = [&]() { ...; std::cout << local_var << std::endl; };
+  thread_pool.commit(func);
+}
+```
+
+##### Rule 2.14.3 Do not use default parameter values for virtual functions.
+
+##### Recommendation 2.14.4 Use strongly typed parameters or member variables. Do not use void*.
+
+#### 2.15 Function Usage
+
+##### Rule 2.15.1 Input parameters must precede output parameters.
+
+```cpp
+bool Func(const std::string &in, FooBar *out1, FooBar *out2);
+```
+
+##### Rule 2.15.2 Pass input parameters as `const T &` and output parameters as `T *`.
+
+```cpp
+bool Func(const std::string &in, FooBar *out1, FooBar *out2);
+```
+
+##### Rule 2.15.3 When ownership is not involved, pass T * or const T & instead of a smart pointer.
+
+```cpp
+// Correct
+bool Func(const FooBar &in);
+// Incorrect
+bool Func(std::shared_ptr<FooBar> in);
+```
+
+##### Rule 2.15.4 To transfer ownership, you are advised to pass a shared_ptr by value and move it.
+
+```cpp
+class Foo {
+ public:
+  explicit Foo(shared_ptr<T> x) : x_(std::move(x)) {}
+ private:
+  shared_ptr<T> x_;
+};
+```
+
+##### Rule 2.15.5 Mark single-parameter constructors explicit. Do not mark multi-parameter constructors explicit.
+
+```cpp
+explicit Foo(int x);           // good
+explicit Foo(int x, int y=0);  // good
+Foo(int x, int y=0);           // bad
+explicit Foo(int x, int y);    // bad
+```
+
+##### Rule 2.15.6 Copy constructors and copy assignment operators should be implemented or hidden together.
+
+```cpp
+class Foo {
+ private:
+  Foo(const Foo&) = default;
+  Foo& operator=(const Foo&) = default;
+  Foo(Foo&&) = delete;
+  Foo& operator=(Foo&&) = delete;
+};
+```
+
+##### Rule 2.15.7 [Question] Do not save or delete pointer parameters.
+
+##### Rule 2.15.8 [Question] Do not use insecure functions as listed.
+
+##### Rule 2.15.9 [Question] Do not use insecure exit functions as listed.
+
+```cpp
+{
+  kill(...);  // If you invoke kill to forcibly terminate other processes (such as kill -9), their resources cannot be cleaned up.
+  TerminateProcess();  // If you call TerminateProcess to forcibly terminate other processes, their resources cannot be cleaned up.
+  pthread_exit();  // Do not terminate a thread explicitly. A thread function exits automatically and safely when it finishes executing.
+  ExitThread();  // Do not terminate a thread explicitly. A thread function exits automatically and safely when it finishes executing.
+  exit();  // Do not call this function anywhere except in the main function. The program must exit safely.
+  ExitProcess();  // Do not call this function anywhere except in the main function. The program must exit safely.
+  abort();  // Forbidden. With abort, the program exits immediately and resources cannot be cleaned up.
+}
+```
+
+##### Rule 2.15.10 Do not use the rand function to generate pseudo-random numbers for security purposes.
+
+The rand() function in the C standard library generates pseudo-random numbers, which are predictable. To generate random numbers for security purposes, use /dev/random.
+
+##### Rule 2.15.11 Do not use the string class to store sensitive information.
+
+The string class is the string management class defined in C++. If sensitive information such as passwords is handled through the string class, copies of it may be scattered across memory and cannot be cleared.
+
+In the following code, the Foo function obtains the password, saves it to the string variable password, and then passes it to the VerifyPassword function. In this process, two copies of the password exist in memory.
+
+```cpp
+int VerifyPassword(string password) {
+  //...
+}
+int Foo() {
+  string password = GetPassword();
+  VerifyPassword(password);
+  ...
+}
+```
+Sensitive information must be stored in char or unsigned char buffers. For example:
+
+```cpp
+int VerifyPassword(const char *password) {
+  //...
+}
+int Foo() {
+  char password[MAX_PASSWORD] = {0};
+  GetPassword(password, sizeof(password));
+  VerifyPassword(password);
+  ...
+}
+```
+
+##### Rule 2.15.12 Clear sensitive information in the memory immediately after use.
+
+Sensitive information, such as passwords and keys, must be cleared immediately after use so that attackers cannot obtain it.
+
+
+#### 2.16 Memory
+##### Rule 2.16.1 Check whether memory allocation is successful.
+
+If memory allocation fails, subsequent operations risk undefined behavior.
For example, if malloc fails and returns a null pointer, dereferencing that pointer is undefined behavior.
+
+##### Rule 2.16.2 Do not reference uninitialized memory.
+
+The memory allocated by malloc and new is not initialized to 0. Ensure that the memory is initialized before it is referenced.
+
+##### Rule 2.16.3 Do not use the realloc() function.
+
+The behavior of realloc varies with its arguments. It is not a well-designed function: although it offers some coding convenience, it easily leads to various bugs.
+
+##### Rule 2.16.4 Do not use the alloca() function to apply for stack memory.
+
+Neither POSIX nor C99 defines the behavior of alloca(), and some platforms do not support it, so using alloca() reduces program compatibility and portability. The function allocates memory in the stack frame, and the requested size may exceed the stack boundary and affect code execution.
+
+#### 2.17 File
+
+##### Rule 2.17.1 File paths must be canonicalized before use.
+
+A file path that comes from external data must be canonicalized first. Otherwise, attackers can construct a malicious file path to access files without authorization.
+For example, an attacker can construct "../../../etc/passwd" to access arbitrary files.
+Use the realpath() function in Linux and the PathCanonicalize() function in Windows for file path canonicalization.
+[Noncompliant Code Example]
+The following code obtains the file name from an external system, concatenates the file name into a file path, and directly reads the file content. As a result, the attacker can read the content of any file.
+
+```cpp
+char *fileName = GetMsgFromRemote();
+...
+sprintf_s(untrustPath, sizeof(untrustPath), "/tmp/%s", fileName);
+char *text = ReadFileContent(untrustPath);  // Bad: untrustPath is read without checking whether access to it is allowed.
+```
+[Compliant Code Example]
+Canonicalize the file path and then check whether the path is valid in the program.
+```cpp
+char *fileName = GetMsgFromRemote();
+...
+sprintf_s(untrustPath, sizeof(untrustPath), "/tmp/%s", fileName);
+char path[PATH_MAX] = {0};
+if (realpath(untrustPath, path) == NULL) {
+  // error
+  ...
+}
+if (!IsValidPath(path)) {  // Good: Check whether the file location is correct.
+  // error
+  ...
+}
+char *text = ReadFileContent(path);  // Read the canonicalized and validated path.
+```
+Exceptions:
+Command line programs that run on the console, and file paths that are entered manually on the console, are exempt from this rule.
+
+
+##### Rule 2.17.2 Do not create temporary files in the shared directory.
+
+Temporary files of a program must be used exclusively by that program. Otherwise, other users of the shared directory may obtain additional information about the program, resulting in information leakage. Therefore, do not create temporary files that should be used only by the program itself in any shared directory.
+For example, the /tmp directory in Linux is a shared directory that all users can access. Do not create such temporary files in this directory.
+
+#### 2.18 Secure Function
+
+| Secure Function Type | Description | Remarks |
+| --- | --- | --- |
+| xxx_s | Secure function API of the Huawei Secure C library | Can be used when the Huawei Secure C library is integrated. |
+| xxx_sp | Performance-optimized secure function API of the Huawei Secure C library (macro implementation) | The performance-optimized macro interface is effective when count, destMax, and strSrc are constants. When they are variables, the optimization effect is not obvious. Usage policy: use the _s interface by default, and use the _sp interface only at performance-sensitive call sites, in the following scenarios:<br>a) memset_sp and memcpy_sp: destMax and count are constants.<br>b) strcpy_sp or strcat_sp: destMax is a constant and strSrc is a literal.<br>c) strncpy_sp or strncat_sp: destMax and count are constants and strSrc is a literal. |
+
+##### Rule 2.18.1 Use secure functions provided by the community in the secure function library. Do not use dangerous functions related to memory operations.
+
+| Function Type | Dangerous Function | Secure Replacement Function |
+| --- | --- | --- |
+| Memory copy | memcpy or bcopy | memcpy_s |
+| | wmemcpy | wmemcpy_s |
+| | memmove | memmove_s |
+| | wmemmove | wmemmove_s |
+| String copy | strcpy | strcpy_s |
+| | wcscpy | wcscpy_s |
+| | strncpy | strncpy_s |
+| | wcsncpy | wcsncpy_s |
+| String concatenation | strcat | strcat_s |
+| | wcscat | wcscat_s |
+| | strncat | strncat_s |
+| | wcsncat | wcsncat_s |
+| Formatted output | sprintf | sprintf_s |
+| | swprintf | swprintf_s |
+| | vsprintf | vsprintf_s |
+| | vswprintf | vswprintf_s |
+| | snprintf | snprintf_s or snprintf_truncated_s |
+| | vsnprintf | vsnprintf_s or vsnprintf_truncated_s |
+| Formatted input | scanf | scanf_s |
+| | wscanf | wscanf_s |
+| | vscanf | vscanf_s |
+| | vwscanf | vwscanf_s |
+| | fscanf | fscanf_s |
+| | fwscanf | fwscanf_s |
+| | vfscanf | vfscanf_s |
+| | vfwscanf | vfwscanf_s |
+| | sscanf | sscanf_s |
+| | swscanf | swscanf_s |
+| | vsscanf | vsscanf_s |
+| | vswscanf | vswscanf_s |
+| Standard input stream input | gets | gets_s |
+| Memory initialization | memset | memset_s |
+
+
+##### Rule 2.18.2 Correctly set the destMax parameter in secure functions.
+
+##### Rule 2.18.3 Do not encapsulate secure functions.
+
+##### Rule 2.18.4 Do not rename secure functions using macros.
+
+```cpp
+#define XXX_memcpy_s memcpy_s
+#define SEC_MEM_CPY memcpy_s
+#define XX_memset_s(dst, dstMax, val, n) memset_s((dst), (dstMax), (val), (n))
+```
+
+##### Rule 2.18.5 Do not customize secure functions.
+
+Renaming secure functions with macros makes it harder for static code scanning tools (non-compiling ones) to customize rules that detect misuse of secure functions.
+In addition, because naming styles vary, the new names obscure the real purpose of the functions, which easily leads to misunderstanding of the code and misuse of the renamed secure functions. Renaming a secure function does not change the checking capability of the function itself.
+
+```cpp
+void MemcpySafe(void *dest, unsigned int destMax, const void *src, unsigned int count) {
+  ...
+}
+```
+
+##### Rule 2.18.6 Check the return values of secure functions and correctly process them.
+
+In principle, whenever a secure function is used, its return value must be checked. If the return value is not EOK, the calling function should return immediately and must not continue.
+A secure function may return several different error values. If a secure function fails, perform one or more of the following operations, depending on the product scenario, before returning:
+(1) Record logs.
+(2) Return an error.
+(3) Call abort to exit the program immediately.
+
+```cpp
+{
+  ...
+  err = memcpy_s(destBuff, destMax, src, srcLen);
+  if (err != EOK) {
+    MS_LOG("memcpy_s failed, err = %d\n", err);
+    return FALSE;
+  }
+  ...
+}
+```
+
+##### Rule 2.18.7 Do not pass externally controllable data as parameters to process-starting functions such as system, popen, WinExec, ShellExecute, execl, execlp, execle, execv, execvp, and CreateProcess.
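
One way to respect Rule 2.18.7 when external data cannot be avoided entirely is to accept only values that pass a strict whitelist before they get anywhere near a process-starting call. The sketch below is an illustration under assumptions, not part of the guide: `IsSafeToolName` is a hypothetical helper, and the 64-character limit and the allowed character set are arbitrary choices.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only: accepts short names made of
// letters, digits, '_' and '-', and rejects everything else, including
// shell metacharacters and path separators.
bool IsSafeToolName(const std::string &name) {
  const std::size_t kMaxNameLength = 64;  // assumed limit
  if (name.empty() || name.size() > kMaxNameLength) {
    return false;
  }
  return std::all_of(name.begin(), name.end(), [](unsigned char ch) {
    return std::isalnum(ch) != 0 || ch == '_' || ch == '-';
  });
}
```

A caller would reject the request when the check fails rather than trying to escape or repair the input. A whitelist is used here because enumerating every dangerous character is much easier to get wrong.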
+
+##### Rule 2.18.8 Do not pass externally controllable data as parameters to module loading functions such as dlopen/LoadLibrary.
+
+##### Rule 2.18.9 Do not call non-async-signal-safe functions in signal handlers.
+
+Signal handlers should be as simple as possible. If a function that is not async-signal-safe is called in a signal handler, it may not produce the expected results.
+The signal handler in the following code writes logs by calling fprintf(), which is not an async-signal-safe function.
+```cpp
+void Handler(int sigNum) {
+  ...
+  fprintf(stderr, "%s\n", info);
+}
+```
+
+------------------------------------------
diff --git a/security/coding_guild_cpp_zh_cn.md b/security/coding_guild_cpp_zh_cn.md
index 3ee7960..56a3898 100644
--- a/security/coding_guild_cpp_zh_cn.md
+++ b/security/coding_guild_cpp_zh_cn.md
@@ -421,9 +421,9 @@ class FinalDerived : public Derived {
 }
 ```
 
-##### 规则 2.14.4 禁止虚函数使用缺省参数值.
+##### 规则 2.14.3 禁止虚函数使用缺省参数值.
 
-##### 建议 2.14.5 使用强类型参数\成员变量,避免使用void*.
+##### 建议 2.14.4 使用强类型参数\成员变量,避免使用void*.
 
 #### 2.15 函数使用
 
diff --git a/security/coding_guild_python_en.md b/security/coding_guild_python_en.md
new file mode 100644
index 0000000..97f7d23
--- /dev/null
+++ b/security/coding_guild_python_en.md
@@ -0,0 +1,340 @@
+
+- [Note](#note)
+- [Scope](#scope)
+    - [Code Style](#1-code-style)
+        - [Naming](#11-naming)
+        - [Format](#12-format)
+        - [Comment](#13-comment)
+        - [Log](#14-log)
+    - [General Coding](#2-general-coding)
+        - [Interface Declaration](#21-interface-declaration)
+        - [Data Verification](#22-data-verification)
+        - [Abnormal Behavior](#23-abnormal-behavior)
+        - [Serialization and Deserialization](#24-serialization-and-deserialization)
+
+## Note
+
+This document is based on [PEP 8](https://www.python.org/dev/peps/pep-0008/), the Huawei Python Coding Style Guide, the Huawei Python Secure Coding Standard, and industry consensus.
To participate in the MindSpore community, follow this style guide first (it takes precedence where it conflicts with PEP 8), and then PEP 8.
+If you disagree with a rule, submit an issue that explains your reasons. The change takes effect after the MindSpore community operation team has reviewed and accepted the issue and revised the guide.
+
+## Scope
+
+MindSpore open source community
+
+------------------------
+
+
+### 1. Code Style
+
+#### 1.1 Naming
+
+##### Rule 1.1.1 Package names and module names are in lowercase and cannot contain underscores (_).
+
+##### Rule 1.1.2 Class names use CamelCase with a capitalized first letter. Private classes are prefixed with an underscore (_).
+
+```python
+class _Foo:
+    _instance = None
+    pass
+```
+
+##### Rule 1.1.3 Function names and variable names are in lowercase, with words separated by underscores (_).
+
+```python
+def _func_example(path):
+    pass
+```
+
+##### Recommendation 1.1.4 Do not use single-character names except for iterators and counters.
+
+#### 1.2 Format
+
+##### Rule 1.2.1 Ensure that each line contains a maximum of 120 characters.
+
+If a line would exceed 120 characters, break it at a sensible point.
+
+##### Rule 1.2.2 Use spaces to indent, four at a time. Tab indent is forbidden.
+
+##### Rule 1.2.3 Import in the following order: standard library, third-party modules, and custom modules.
+
+##### Rule 1.2.4 Do not use parentheses in return statements and conditional statements.
+
+##### Rule 1.2.5 Separate module-level functions and classes with two blank lines, and class member functions with one blank line. Add blank lines between comments and code as needed. In principle, there should be no more than two consecutive blank lines.
+
+##### Rule 1.2.6 Delete invalid or redundant code directly. Do not retain the code in the form of comments or TODO. You are advised to submit an issue to record it.
+
+#### 1.3 Comment
+
+##### Rule 1.3.1 File header comments must contain copyright statements.
+
+All Python files must contain the following copyright statement:
+
+```python
+
+# Copyright 2019 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""
+Add notes.
+"""
+import xxx
+
+```
+
+> Notes:
+> Files created in 2020 should contain `Copyright 2020 Huawei Technologies Co., Ltd`.
+> Files created in 2019 and modified in 2020 should contain `Copyright 2019-2020 Huawei Technologies Co., Ltd`.
+
+##### Rule 1.3.2 Comply with the comment formats of external classes, methods, operators, and cells:
+
+- The comment formats of `class` and `def` are the same. Use the Python comment style that is generally accepted by the industry, and indent the comment under the declaration. Every `class` and `def` should be commented. For classes and methods internal to a module, a one-line introduction is enough.
+- For details about the comment formats, see [MindSpore Comment Specifications](https://gitlab.huawei.com/mindspore/docs/wikis/Python-API%E6%B3%A8%E9%87%8A%E8%A7%84%E8%8C%83).
+
+##### Rule 1.3.3 Do not use comments to mask Pylint alarms.
+
+#### 1.4 Log
+
+##### Rule 1.4.1 Capitalize the first letter of the exception log text.
+
+##### Rule 1.4.2 Variable names in log texts must be marked with single quotation marks.
+
+### 2.
General Coding
+
+#### 2.1 Interface Declaration
+
+##### Rule 2.1.1 User interfaces are described in `__all__` of a file, and `__all__` is placed between import and code.
+
+##### Rule 2.1.2 Prefix non-external methods that are used only in the current file with an underscore (_). Method names used across modules do not need underscore prefixes. User interfaces are declared in `__all__` of a file.
+
+#### 2.2 Data Verification
+
+##### Rule 2.2.1 Check the validity of all external data, including but not limited to function input parameters, command-line input, file formats, file sizes, environment variables, and user data.
+
+##### Recommendation 2.2.2 File paths must be canonicalized before use.
+
+A file path that comes from external data must be canonicalized first. Otherwise, attackers can construct a malicious file path to access files without authorization.
+For example, an attacker can construct ../../../etc/passwd to access arbitrary files.
+For file path canonicalization, use, for example, the realpath() function in Linux and the PathCanonicalize() function in Windows.
+[Noncompliant Code Example]
+The following code obtains the file name from an external system, concatenates the file name into a file path, and directly reads the file content. As a result, the attacker can read the content of any file.
+```python
+ The following is an example of error code:
+```
+[Compliant Code Example]
+Canonicalize the file path and then check whether the path is valid in the program.
+```python
+ The following is an example of correct code:
+```
+[Exceptions]
+Command line programs that run on the console, and file paths that are entered manually on the console, are exempt from this rule.
+
+##### Rule 2.2.3 Do not invoke the OS command parser to run commands or programs.
+
+If unverified, untrusted input is used as a system command or as part of one, a command injection vulnerability may result.
+In a command injection attack, the command is executed at the same privilege level as the Python application, which effectively gives the attacker shell-like capabilities. In Python, os.system or os.popen is often used to start a new process. If the command to be executed comes from external input, command and parameter injection may occur.
+When running a command, pay attention to the following points:
+1. Do not concatenate input parameters into the command string. If the input parameters must be concatenated, perform whitelist filtering on them.
+2. Verify the type of the input parameters. For example, integer data can be explicitly converted into integers.
+3. Ensure that the format string is correct. For example, use %d instead of %s to concatenate parameters of the int type.
+
+[Noncompliant Code Example 1]
+An attacker who knows or can change the value of the environment variable APPHOME can place a malicious program, named after the constant INITCMD, in the corresponding directory.
+The code below then executes that program.
+
+```python
+import os
+
+# INITCMD is a constant defined elsewhere in the program.
+home = os.getenv('APPHOME')
+cmd = os.path.join(home, INITCMD)
+os.system(cmd)
+```
+
+[Noncompliant Code Example 2]
+The value of the backuptype attribute is not verified. The value is entered by the user and can be used for an attack. For example, the user may enter `&& del c:\\dbms\\*.*`:
+
+```python
+# The value is obtained from the user configuration.
+btype = req.field('backuptype')
+cmd = "cmd.exe /K \"c:\\util\\rmanDB.bat " + btype + "&&c:\\util\\cleanup.bat\""
+os.system(cmd)
+```
+
+[Noncompliant Code Example 3]
+The command-line argument is not verified before it is concatenated into the command string, so it can be used for an attack:
+
+```python
+import os
+import sys
+
+try:
+    print(os.system("ls " + sys.argv[1]))
+except Exception as ex:
+    print('exception:', ex)
+```
+Attackers can run the following command to exploit this vulnerability:
+```shell
+python test.py ". && echo bad"
+```
+Actually, the following two commands are executed:
+```shell
+ls .
+echo bad
+```
+
+[Compliant Code Example]
+Do not use os.system. You can use standard APIs instead of running system commands to complete the tasks.
+```python
+import os
+import sys
+
+try:
+    print(os.listdir(sys.argv[1]))
+except Exception as ex:
+    print(ex)
+```
+
+#### 2.3 Abnormal Behavior
+
+##### Rule 2.3.1 Exceptions must be properly handled. Do not suppress or ignore exceptions found in the check results.
+
+Ensure that the program continues to run after an except block only if it is still in a valid state. The except block must either recover from the exception or raise another exception that suits the context of the current except block, so that the nearest enclosing try-except statement can recover.
+[Compliant Code Example]
+The except block records the exception, and the loop repeats until the file operation succeeds.
+```python
+import traceback
+
+validFlag = False
+while not validFlag:
+    try:
+        # If requested file does not exist, throws FileNotFoundError
+        # If requested file exists, sets validFlag to true
+        validFlag = True
+    except FileNotFoundError:
+        traceback.print_exc()
+```
+[Exceptions]
+1. If a resource release failure does not affect the subsequent program behavior, the exception that occurs during resource release can be suppressed. Examples of releasing resources include closing files, network sockets, threads, and so on. These resources are usually released in the except or finally block and will not be used during subsequent program operation.
+Therefore, these exceptions cannot affect the subsequent behavior of the program other than through resource exhaustion. When resource exhaustion is adequately handled, it is sufficient to sanitize the exceptions and record logs (for future improvement). In this case, no additional error handling is needed.
+2. If it is impossible to recover from an exception at a specific abstraction level, the code at that level does not need to handle the exception. Instead, the code at that level should raise an appropriate exception so that higher-level code can catch it and attempt to recover. In this case, the most common implementation method is to omit the except block and let the exception propagate.
+
+##### Rule 2.3.2 When using try…except… to protect the code, use finally… to ensure that operation objects are released after an exception occurs.
+
+When using try…except… to protect the code, if an exception occurs during code execution, use finally… to ensure that operation objects can be released.
+
+[Compliant Code Example]
+```python
+handle = open(r"/tmp/sample_data.txt")  # May raise IOError
+try:
+    data = handle.read()  # May raise UnicodeDecodeError
+except UnicodeDecodeError as decode_error:
+    print(decode_error)
+finally:
+    handle.close()  # Always run after try:
+```
+
+##### Rule 2.3.3 Do not capture all exceptions with a bare "except:" statement.
+
+Note that Python is tolerant of exceptions. A bare "except:" statement captures any error, including errors caused by mistakes in the code itself, and therefore hides potential bugs. Specify the exceptions to be handled when using try…except… to protect the code. The Exception class is the base class of most runtime exceptions and should not be used as a catch-all either. The "try" block should contain only the statements whose exceptions must be handled at the current location. The "except" statement should only capture exceptions that must be handled.
+For example, in code that opens a file, the "try" block should contain only the "open" statement, and the "except" statement should capture only FileNotFoundError. Other unexpected exceptions are captured by functions in the upper layer, or propagate to the calling program so that they are exposed.
+
+[Noncompliant Code Example]
+Two types of exceptions may occur in the following code, and a single bare "except:" statement handles both. If the exception occurs in the open statement, handle has not been assigned, so calling the close method in the except block raises a further error reporting that handle is not defined.
+```python
+try:
+    handle = open(r"/tmp/sample_data.txt")  # May raise IOError
+    data = handle.read()  # May raise UnicodeDecodeError
+except:
+    handle.close()
+```
+[Compliant Code Example]
+```python
+try:
+    handle = open(r"/tmp/sample_data.txt")  # May raise IOError
+    try:
+        data = handle.read()  # May raise UnicodeDecodeError
+    except UnicodeDecodeError as decode_error:
+        print(decode_error)
+    finally:
+        handle.close()
+except (FileNotFoundError, IOError) as file_open_except:
+    print(file_open_except)
+```
+
+##### Rule 2.3.4 A raise statement that is not inside an "except:" block must specify an exception.
+
+**Note**: A bare raise can be used only inside the except block of a "try-except" statement, where it re-raises the exception captured by that block.
+
+[Noncompliant Code Example]
+
+```python
+a = 1
+if a == 1:
+    raise
+```
+[Compliant Code Example 1] Raise an exception or a custom exception.
+```python
+a = 1
+if a == 1:
+    raise Exception
+```
+
+[Compliant Code Example 2] Use the raise keyword in the "try-except" statement.
+```python
+import sys
+
+try:
+    f = open('myfile.txt')
+    s = f.readline()
+    i = int(s.strip())
+except IOError as e:
+    print("I/O error({0}): {1}".format(e.errno, e.strerror))
+except ValueError:
+    print("Could not convert data to an integer.")
+except Exception:
+    print("Unexpected error:", sys.exc_info()[0])
+    raise
+```
+
+#### 2.4 Serialization and Deserialization
+
+##### Rule 2.4.1 pickle has security risks. Do not use pickle.load, cPickle.load, or the shelve module to load untrusted data.
+
+##### Rule 2.4.2 Use secure random numbers.
+
+Python implements random number generation in the random module, which provides pseudo-random number generators for various distributions. The generated random numbers can follow a
+uniform, Gaussian, lognormal, negative exponential, gamma, or beta distribution. However, these random numbers are pseudo-random numbers, and
+must not be used for security or cryptographic purposes.
+Use /dev/random to generate secure random numbers, or use the secrets module introduced in Python 3.6.
+
+[Noncompliant Code Example]
+
+```python
+import random
+
+# Pseudo-random numbers: not suitable for security purposes
+print(random.random())
+print(random.randint(0, 10))
+```
+
+[Compliant Code Example]
+
+```python
+import platform
+
+# For details about the length, see the cryptographic algorithm specifications. The length varies according to the scenario.
+randLength = 16
+if platform.system() == 'Linux':
+    with open("/dev/random", 'rb') as file:
+        sr = file.read(randLength)
+        print(sr)
+```
+
+##### Rule 2.4.3 The assert statement is usually used only in test code. Do not include assert statements in released versions.
+
+The assert statement is used only for internal tests during R&D. If AssertionError occurs, it indicates that errors exist in the software design or the code.
+The software should be modified to resolve this issue. Do not include assert statements in externally released production versions.
\ No newline at end of file
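
As an illustration of Rule 2.4.3, the sketch below shows one way to replace an `assert` on an input value with an explicit check. It is only a minimal example under stated assumptions: the function `check_batch_size` and the parameter `batch_size` are hypothetical names and are not part of MindSpore. The motivation is that Python removes `assert` statements when run with the `-O` option, so validation written as an assert can silently disappear in a release environment.

```python
def check_batch_size(batch_size):
    """Validate a hypothetical 'batch_size' argument without using assert."""
    # An assert would be removed under "python -O", so the checks are
    # written as ordinary conditionals that always execute.
    # bool is rejected explicitly because it is a subclass of int.
    if isinstance(batch_size, bool) or not isinstance(batch_size, int):
        raise TypeError("For 'batch_size', the type must be int, but got {}."
                        .format(type(batch_size).__name__))
    if batch_size <= 0:
        raise ValueError("For 'batch_size', the value must be positive, but got {}."
                         .format(batch_size))
    return batch_size
```

Callers can then catch TypeError or ValueError specifically, in line with Rule 2.3.3, and the messages follow the log conventions in Rules 1.4.1 and 1.4.2.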