When reading a file with a for loop, we sometimes want only specific lines, say lines 26 and 30. Python has three built-in features that achieve this, each suited to a different case.

For reading small files

Use fileobject.readlines() or for line in fileobject as a quick solution for small files.

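# Option 1: read every line into a list, then index it (zero-based).
f = open('filename')
lines = f.readlines()
f.close()
print(lines[25])  # 26th line
print(lines[29])  # 30th line

# Option 2: iterate over the file and count lines manually.
wanted = [25, 29]  # zero-based indexes of lines 26 and 30
i = 0
f = open('filename')
for line in f:
    if i in wanted:
        print(line)
    i += 1
f.close()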

For reading many files, possibly repeatedly

There is a more elegant solution for extracting many lines: linecache. It caches file contents, so it can quickly fetch lines from many files, even when the same files are read repeatedly.

>>> import linecache
>>> linecache.getline('/etc/passwd', 4)
'sys:x:3:3:sys:/dev:/bin/sh\n'

Change the 4 to your desired line number, and you're done. Note that linecache line numbers are one-based, so 4 returns the fourth line of the file, not the fifth.
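To check how linecache numbers lines, here is a minimal sketch that writes a small temporary file and reads it back. The helper name demo_getline and the sample file contents are made up for illustration; only linecache.getline() and linecache.clearcache() come from the standard library.

```python
import linecache
import os
import tempfile


def demo_getline():
    # Hypothetical three-line sample file, created only for this demonstration.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("first\nsecond\nthird\n")
        path = tmp.name
    try:
        first = linecache.getline(path, 1)     # line numbers start at 1
        third = linecache.getline(path, 3)
        missing = linecache.getline(path, 99)  # out of range: empty string, no exception
    finally:
        linecache.clearcache()  # drop the cached copy of the temporary file
        os.remove(path)
    return first, third, missing


first, third, missing = demo_getline()
print(repr(first), repr(third), repr(missing))
```

Because getline() returns an empty string instead of raising for a missing line or file, it may be worth checking for an empty result when the line number comes from user input.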

For large files which won’t fit into memory

If the file is very large and causes problems when read into memory, or you don't want to read the whole file into memory at once, it might be a good idea to use enumerate(). Note that this approach can be slow for lines far into the file, because the file is read sequentially:

fp = open("file")
for i, line in enumerate(fp):
    if i == 25:
        print(line)  # 26th line
    elif i == 29:
        print(line)  # 30th line
    elif i > 29:
        break  # stop early: nothing wanted beyond line 30
fp.close()

Note that i == n-1 for the nth line.

In Python 2.6 or later:

with open("file") as fp:
    for i, line in enumerate(fp):
        if i == 25:
            print(line)  # 26th line
        elif i == 29:
            print(line)  # 30th line
        elif i > 29:
            break  # stop early: nothing wanted beyond line 30
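The enumerate() approach can be generalized to any set of line numbers. The following is only a sketch under a few assumptions: read_lines is a hypothetical helper name, it targets Python 3, and it takes one-based line numbers by passing start=1 to enumerate(), so the n-1 adjustment is not needed.

```python
def read_lines(path, wanted):
    """Yield (line_number, line) pairs for the one-based line numbers in `wanted`."""
    wanted = set(wanted)
    if not wanted:
        return
    last = max(wanted)
    with open(path) as fp:
        # start=1 makes the counter match human line numbers.
        for number, line in enumerate(fp, start=1):
            if number in wanted:
                yield number, line
            if number >= last:
                break  # nothing wanted beyond this point, stop reading
```

For example, list(read_lines("file", [26, 30])) would give the 26th and 30th lines while holding only one line in memory at a time. Line numbers beyond the end of the file are silently skipped.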

Compiled and translated from Stack Overflow:

https://stackoverflow.com/questions/2081836/reading-specific-lines-only?answertab=active#tab-top
