
pandas in Python: the parameters of pandas' read_excel() function and how to use it

Published: 2025/3/21



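What read_excel() does

Reads an Excel file into a pandas DataFrame. It supports the xls, xlsx, xlsm, xlsb, odf, ods and odt file extensions, read from a local filesystem or a URL, and offers an option to read a single sheet or a list of sheets.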
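The single-sheet versus multi-sheet behaviour can be tried with a minimal sketch like the one below. It assumes the openpyxl engine is installed, and the workbook contents and sheet names are made up for illustration; the workbook is built in memory so nothing is written to disk.

```python
import io

import pandas as pd

# Build a small two-sheet workbook in memory (illustrative data only).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"Name": ["string1", "string2"], "Value": [1, 2]}).to_excel(
        writer, sheet_name="Sheet1", index=False
    )
    pd.DataFrame({"Name": ["string3"], "Value": [3]}).to_excel(
        writer, sheet_name="Sheet2", index=False
    )
data = buffer.getvalue()

# A single sheet (selected by position or by name) comes back as one DataFrame.
first = pd.read_excel(io.BytesIO(data), sheet_name=0)

# A list of sheets comes back as a dict keyed by the entries that were requested.
some = pd.read_excel(io.BytesIO(data), sheet_name=[0, "Sheet2"])

# sheet_name=None loads every sheet, keyed by sheet name.
everything = pd.read_excel(io.BytesIO(data), sheet_name=None)

print(first)
print(list(some))
print(list(everything))
```

A fresh BytesIO is created for each read here only to keep the reads independent of each other; a file path such as 'tmp.xlsx' works the same way.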

How to use read_excel()

1. The file can be read using the file name as a string or an open file object:

>>> pd.read_excel('tmp.xlsx', index_col=0)
       Name  Value
0   string1      1
1   string2      2
2  #Comment      3

>>> pd.read_excel(open('tmp.xlsx', 'rb'), sheet_name='Sheet3')
   Unnamed: 0      Name  Value
0           0   string1      1
1           1   string2      2
2           2  #Comment      3

2. The index and header can be specified via the index_col and header arguments:

>>> pd.read_excel('tmp.xlsx', index_col=None, header=None)
     0         1      2
0  NaN      Name  Value
1  0.0   string1      1
2  1.0   string2      2
3  2.0  #Comment      3

3. Column types are inferred, but can be specified explicitly:

>>> pd.read_excel('tmp.xlsx', index_col=0, dtype={'Name': str, 'Value': float})
       Name  Value
0   string1    1.0
1   string2    2.0
2  #Comment    3.0
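As a rough illustration of explicit typing, the sketch below contrasts dtype with converters on the same column. It assumes openpyxl is available; the in-memory workbook and the zero-padding converter are invented for this example and are not part of the original article.

```python
import io

import pandas as pd

# Illustrative workbook with one text column and one integer column.
buffer = io.BytesIO()
pd.DataFrame({"Name": ["string1", "string2"], "Value": [1, 2]}).to_excel(
    buffer, index=False, engine="openpyxl"
)
data = buffer.getvalue()

# dtype casts whole columns after they are read.
typed = pd.read_excel(io.BytesIO(data), dtype={"Name": str, "Value": float})

# converters receive each cell of the column and return the transformed value;
# here the numeric cell is turned into a zero-padded string identifier.
padded = pd.read_excel(
    io.BytesIO(data),
    converters={"Value": lambda cell: f"{int(cell):04d}"},
)

print(typed.dtypes)
print(padded)
```

A converter is the more flexible option when the target representation is not a plain dtype, at the cost of running a Python function on every cell.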

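4. True, False and NA values, as well as thousands separators, have defaults but can also be specified explicitly. Supply the values you want as strings or lists of strings:

>>> pd.read_excel('tmp.xlsx', index_col=0, na_values=['string1', 'string2'])
       Name  Value
0       NaN      1
1       NaN      2
2  #Comment      3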
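How na_values interacts with keep_default_na and na_filter can be checked with a small sketch such as the following. It assumes openpyxl is installed, and the workbook (including the literal text cell "NA") is made up to show the default markers at work.

```python
import io

import pandas as pd

# Illustrative workbook: the second Name cell holds the literal text "NA".
buffer = io.BytesIO()
pd.DataFrame({"Name": ["string1", "NA", "#Comment"], "Value": [1, 2, 3]}).to_excel(
    buffer, index=False, engine="openpyxl"
)
data = buffer.getvalue()

# Defaults only: "NA" is one of the built-in missing-value markers.
default = pd.read_excel(io.BytesIO(data))

# Extra markers are appended to the defaults while keep_default_na is True.
extended = pd.read_excel(io.BytesIO(data), na_values=["string1"])

# With keep_default_na=False only the markers passed explicitly are used.
strict = pd.read_excel(io.BytesIO(data), na_values=["string1"], keep_default_na=False)

# na_filter=False disables missing-value detection altogether.
raw = pd.read_excel(io.BytesIO(data), na_filter=False)

for label, frame in [("default", default), ("extended", extended),
                     ("strict", strict), ("raw", raw)]:
    print(label, frame["Name"].isna().tolist())
```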

Detailed description of each read_excel() parameter

Official API: pandas.read_excel

def read_excel, found at pandas.io.excel._base (this is the signature as quoted from a pandas 1.x source; squeeze, convert_float and mangle_dupe_cols were removed in pandas 2.0):

@deprecate_nonkeyword_arguments(allowed_args=2, version="2.0")
@Appender(_read_excel_doc)
def read_excel(
    io,
    sheet_name=0,
    header=0,
    names=None,
    index_col=None,
    usecols=None,
    squeeze=False,
    dtype=None,
    engine=None,
    converters=None,
    true_values=None,
    false_values=None,
    skiprows=None,
    nrows=None,
    na_values=None,
    keep_default_na=True,
    na_filter=True,
    verbose=False,
    parse_dates=False,
    date_parser=None,
    thousands=None,
    comment=None,
    skipfooter=0,
    convert_float=True,
    mangle_dupe_cols=True,
):

Parameter	Type and default	Description
io	str, bytes, ExcelFile, xlrd.Book, path object, or file-like object	Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.xlsx. If you want to pass in a path object, pandas accepts any os.PathLike. By file-like object, we refer to objects with a read() method, such as a file handle (e.g. via the builtin open function) or StringIO.
sheet_name	str, int, list, or None, default 0	Strings are used for sheet names. Integers are used for zero-indexed sheet positions. Lists of strings/integers are used to request multiple sheets. Specify None to get all sheets. Available cases: 0 (the default): 1st sheet as a DataFrame; 1: 2nd sheet as a DataFrame; "Sheet1": load the sheet named "Sheet1"; [0, 1, "Sheet5"]: load the first sheet, the second sheet and the sheet named "Sheet5" as a dict of DataFrame; None: all sheets. Author's note: sheet_name=sheetname_ID specifies which sheet to read.
header	int, list of int, default 0	Row (0-indexed) to use for the column labels of the parsed DataFrame. If a list of integers is passed, those row positions will be combined into a MultiIndex. Use None if there is no header.
names	array-like, default None	List of column names to use. If the file contains no header row, you should explicitly pass header=None. Author's note: using names works like renaming the columns. The list must match the original columns exactly, no more and no fewer; otherwise a ValueError is raised.
index_col	int, list of int, default None	Column (0-indexed) to use as the row labels of the DataFrame. Pass None if there is no such column. If a list is passed, those columns will be combined into a MultiIndex. If a subset of data is selected with usecols, index_col is based on the subset. Author's note: index_col=0 uses the first column as the index instead of reading it as a data column.
usecols	int, str, list-like, or callable, default None	If None, parse all columns. If str, a comma-separated list of Excel column letters and column ranges (e.g. "A:E" or "A,C,E:F"); ranges are inclusive of both sides. If list of int, the column numbers to be parsed. If list of str, the column names to be parsed (new in version 0.24.0). If callable, each column name is evaluated against it and the column is parsed if the callable returns True (new in version 0.24.0). Returns a subset of the columns according to the behavior above. Author's note: e.g. usecols=[1,2,7,8,14] selects the columns to read by index.
squeeze	bool, default False	If the parsed data only contains one column, return a Series.
dtype	Type name or dict of column -> type, default None	Data type for data or columns, e.g. {'a': np.float64, 'b': np.int32}. Use object to preserve data as stored in Excel and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.
engine	str, default None	If io is not a buffer or path, this must be set to identify io. Supported engines: "xlrd", "openpyxl", "odf", "pyxlsb"; default "xlrd" in the version documented here. Engine compatibility: "xlrd" supports most old/new Excel file formats; "openpyxl" supports newer Excel file formats; "odf" supports OpenDocument file formats (.odf, .ods, .odt); "pyxlsb" supports binary Excel files. Note that newer pandas releases read .xlsx files with openpyxl by default, and xlrd 2.0 and later handles only .xls.
converters	dict, default None	Dict of functions for converting values in certain columns. Keys can either be integers or column labels; values are functions that take one input argument, the Excel cell content, and return the transformed content.
true_values	list, default None	Values to consider as True.
false_values	list, default None	Values to consider as False.
skiprows	list-like	Rows to skip at the beginning (0-indexed).
nrows	int, default None	Number of rows to parse. New in version 0.23.0.
na_values	scalar, str, list-like, or dict, default None	Additional strings to recognize as NA/NaN. If a dict is passed, specific per-column NA values. By default the following values are interpreted as NaN: '', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan', '1.#IND', '1.#QNAN', '<NA>', 'N/A', 'NA', 'NULL', 'NaN', 'n/a', 'nan', 'null'.
keep_default_na	bool, default True	Whether or not to include the default NaN values when parsing the data. Depending on whether na_values is passed in, the behavior is as follows. If keep_default_na is True and na_values are specified, na_values is appended to the default NaN values used for parsing. If keep_default_na is True and na_values are not specified, only the default NaN values are used for parsing. If keep_default_na is False and na_values are specified, only the NaN values specified in na_values are used for parsing. If keep_default_na is False and na_values are not specified, no strings will be parsed as NaN. Note that if na_filter is passed in as False, the keep_default_na and na_values parameters will be ignored.
na_filter	bool, default True	Detect missing value markers (empty strings and the value of na_values). In data without any NAs, passing na_filter=False can improve the performance of reading a large file.
verbose	bool, default False	Indicate the number of NA values placed in non-numeric columns.
parse_dates	bool, list-like, or dict, default False	The behavior is as follows. bool: if True, try parsing the index. List of int or names: e.g. [1, 2, 3] tries parsing columns 1, 2, 3 each as a separate date column. List of lists: e.g. [[1, 3]] combines columns 1 and 3 and parses them as a single date column. dict: e.g. {'foo': [1, 3]} parses columns 1, 3 as a date and calls the result 'foo'. If a column or index contains an unparseable date, the entire column or index will be returned unaltered as an object data type. If you don't want to parse some cells as dates, change their type in Excel to "Text". For non-standard datetime parsing, use pd.to_datetime after pd.read_excel. Note: a fast path exists for iso8601-formatted dates.
date_parser	function, optional	Function to use for converting a sequence of string columns to an array of datetime instances. The default uses dateutil.parser.parser to do the conversion. pandas will try to call date_parser in three different ways, advancing to the next if an exception occurs: 1) pass one or more arrays (as defined by parse_dates) as arguments; 2) concatenate (row-wise) the string values from the columns defined by parse_dates into a single array and pass that; and 3) call date_parser once for each row using one or more strings (corresponding to the columns defined by parse_dates) as arguments.
thousands	str, default None	Thousands separator for parsing string columns to numeric. Note that this parameter is only necessary for columns stored as TEXT in Excel; any numeric columns will automatically be parsed, regardless of display format.
comment	str, default None	Comments out the remainder of a line. Pass a character or characters to this argument to indicate comments in the input file. Any data between the comment string and the end of the current line is ignored.
skipfooter	int, default 0	Rows at the end to skip (0-indexed).
convert_float	bool, default True	Convert integral floats to int (i.e., 1.0 -> 1). If False, all numeric data will be read in as floats: Excel stores all numbers as floats internally.
mangle_dupe_cols	bool, default True	Duplicate columns will be specified as 'X', 'X.1', ..., 'X.N', rather than 'X', ..., 'X'. Passing in False will cause data to be overwritten if there are duplicate names in the columns.
Returns	DataFrame or dict of DataFrames	DataFrame from the passed-in Excel file. See the notes on the sheet_name argument for more information on when a dict of DataFrames is returned.
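Several of the row and column selection parameters in the table (skiprows, usecols, nrows, header, names, index_col) can be combined, roughly as sketched below. The sheet layout, with a junk banner row above the real header, and all column names are hypothetical; openpyxl is assumed to be installed.

```python
import io

import pandas as pd

# Hypothetical layout: one banner row, then the header row, then the data rows.
rows = [
    ["exported", "by", "some", "tool"],
    ["id", "name", "price", "qty"],
    [1, "a", 1.5, 10],
    [2, "b", 2.5, 20],
    [3, "c", 3.5, 30],
]
buffer = io.BytesIO()
pd.DataFrame(rows).to_excel(buffer, header=False, index=False, engine="openpyxl")
data = buffer.getvalue()

# Skip the banner row, keep Excel columns A, C and D, and stop after two data rows.
subset = pd.read_excel(io.BytesIO(data), skiprows=[0], usecols="A,C:D", nrows=2)

# Skip the banner and the original header, then supply replacement column names.
renamed = pd.read_excel(
    io.BytesIO(data),
    skiprows=[0, 1],
    header=None,
    names=["id", "label", "unit_price", "quantity"],
)

# Use the first column as the row index rather than as a data column.
indexed = pd.read_excel(io.BytesIO(data), skiprows=[0], index_col=0)

print(subset)
print(renamed)
print(indexed)
```

Passing header=None together with names avoids having the first data row consumed as a header when the original header row is skipped.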
