
Check if files have same name and store line count of files with same names
<p>I'm relatively new to Python and I could really use some input.</p>

<p>I have a script running which stores files in the following format:</p>

<pre><code>201309030700__81.28.236.2.txt
201308240115__80.247.17.26.txt
201308102356__84.246.88.20.txt
201309030700__92.243.23.21.txt
201308030150__203.143.64.11.txt
</code></pre>

<p>Each file has some lines whose total I want to count and then store. For example, I want to go through these files and, if files share the same date (the first part of the file name), store their counts together in the same file in the following format:</p>

<pre><code>201309030700__81.28.236.2.txt has 10 lines
201309030700__92.243.23.21.txt has 8 lines
</code></pre>

<p>That is, create a file named after the date 20130903 (the last 4 digits are the time, which I don't want): the file 20130903.txt, which has two lines, 10 and 8.</p>

<p>I have the following code but I'm not getting anywhere, please help.</p>

<pre><code>import os, os.path

asline = []
ipasline = []

def main():
    p = './results_1/'
    np = './new/'
    fd = os.listdir(p)
    run(fd)

def writeFile(fd, flines):
    fo = np + fd + '.txt'
    with open(fo, 'a') as f:
        r = '%s\t %s\n' % (fd, flines)
        f.write(r)

def run(path):
    for root, dirs, files in os.walk(path):
        for cfile in files:
            stripFN = os.path.splitext(cfile)[0]
            fileDate = stripFN.split('_')[0]
            fileIP = stripFN.split('_')[-1]
            if cfile.startswith(fileDate):
                hp = 0
                for currentFile in files.readlines()[1:]:
                    hp += 1
                writeFile(fdate, hp)
</code></pre>

<p>I tried to play around with this script:</p>

<pre><code>if not os.path.exists(os.path.join(p, y)):
    os.mkdir(os.path.join(p, y))
    np = '%s%s' % (datetime.now().strftime(FORMAT), path)
if os.path.exists(os.path.join(p, m)):
    os.chdir(os.path.join(p, month, d))
    np = '%s%s' % (datetime.now().strftime(FORMAT), path)
</code></pre>

<p>Where FORMAT has the following value:</p>

<blockquote>
<p>20130903</p>
</blockquote>

<p>But I can't seem to get this to work.</p>

<p>EDIT: I have modified the code 
as follows, and it kind of does what I wanted, but I'm probably doing redundant things, and I still haven't taken into account that I'm processing a huge number of files, so this may not be the most efficient way. Please have a look.</p>

<pre><code>import re, os, os.path

p = './results_1/'
np = './new/'
fd = os.listdir(p)
star = "*"

def writeFile(fd, flines):
    fo = './new/' + fd + '_v4.txt'
    with open(fo, 'a') as f:
        r = '%s\n' % (flines)
        f.write(r)

for f in fd:
    pathN = os.path.join(p, f)
    files = open(pathN, 'r')
    fileN = os.path.basename(pathN)
    stripFN = os.path.splitext(fileN)[0]
    fileDate = stripFN.split('_')[0]
    fdate = fileDate[0:8]
    lnum = len(files.readlines())
    writeFile(fdate, lnum)
    files.close()
</code></pre>

<p>At the moment it is writing a new line to the file for each counted line total. HOWEVER, I have sorted this out. I would appreciate some input, thank you very much.</p>

<p>EDIT 2: Now I'm getting the output for each date, with the date as the file name. The files now appear as:</p>

<pre><code>20130813.txt
20130819.txt
20130825.txt
</code></pre>

<p>Each file now looks like:</p>

<pre><code>15
17
18
21
14
18
14
13
17
11
11
18
15
15
12
17
9
10
12
17
14
17
13
</code></pre>

<p>And it goes on for a further 200+ lines in each file. 
Ideally, knowing how many times each number occurs, sorted with the smallest number first, would be the best desired outcome.</p> <p>I have tried something like:</p>

<pre><code>import sys
from collections import Counter

p = '.txt'
d = []

with open(p, 'r') as f:
    for x in f:
        x = int(x)
        d.append(x)

d.sort()
o = Counter(d)
print o
</code></pre>

<p>Does this make sense?</p>

<p>EDIT 3:</p>

<p>I have the following script which counts the unique values for me, but I'm still unable to sort the output by the value itself.</p>

<pre><code>import os
from collections import Counter

p = './newR'
fd = os.listdir(p)

for f in fd:
    pathN = os.path.join(p, f)
    with open(pathN, 'r') as infile:
        fileN = os.path.basename(pathN)
        stripFN = os.path.splitext(fileN)[0]
        fileDate = stripFN.split('_')[0]
        counts = Counter(l.strip() for l in infile)
        for line, count in counts.most_common():
            print line, count
</code></pre>

<p>This has the following results:</p>

<pre><code>14 291
15 254
12 232
13 226
17 212
16 145
18 127
11 102
10 87
19 64
21 33
20 24
22 15
9 15
23 9
30 6
60 3
55 3
25 3
</code></pre>

<p>The output should look like:</p>

<pre><code>9 15
10 87
11 102
12 232
13 226
14 291
etc
</code></pre>

<p>What is the most efficient way of doing this?</p>
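<p>For the last step, one way (a minimal sketch in Python 3; the sample data below is hypothetical, standing in for one of the per-date files described above) is to count with <code>Counter</code> as in EDIT 3, but sort the items numerically by the value instead of calling <code>most_common()</code>, which orders by frequency:</p>

```python
from collections import Counter

# Hypothetical sample standing in for one per-date file:
# one line count per line, as in the EDIT 2 output.
lines = ["14", "15", "14", "9", "9", "15", "14"]

# Convert to int before counting so the sort is numeric
# ("9" would otherwise sort after "10" as a string).
counts = Counter(int(l.strip()) for l in lines)

# sorted() on the items orders by the value itself, smallest first;
# Counter.most_common() would order by frequency instead.
for value, count in sorted(counts.items()):
    print(value, count)
# prints:
# 9 2
# 14 3
# 15 2
```

<p>The key point is the <code>int()</code> conversion: EDIT 3 counts the stripped strings, so a plain sort would be lexicographic and put <code>10</code> before <code>9</code>.</p>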
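<p>Going back to the original grouping task at the top, the whole thing can be sketched in one pass with a <code>defaultdict</code>. This is only a sketch, not the script above: the function name <code>group_line_counts</code> is made up for illustration, and it assumes the filename layout shown at the start (date+time, two underscores, IP):</p>

```python
import os
from collections import defaultdict

def group_line_counts(src, dst):
    """Write one file per date (first 8 digits of each filename in src)
    containing the line count of every file sharing that date."""
    os.makedirs(dst, exist_ok=True)

    # Group line counts by the date part of each filename.
    counts_by_date = defaultdict(list)
    for name in sorted(os.listdir(src)):
        date = name.split('_')[0][:8]   # '201309030700__...' -> '20130903'
        with open(os.path.join(src, name)) as f:
            counts_by_date[date].append(sum(1 for _ in f))

    # Write one output file per date, one count per line.
    for date, counts in counts_by_date.items():
        with open(os.path.join(dst, date + '.txt'), 'w') as out:
            out.write('\n'.join(map(str, counts)) + '\n')
```

<p>Counting with <code>sum(1 for _ in f)</code> avoids reading each whole file into memory the way <code>len(files.readlines())</code> does, which matters with a huge number of files.</p>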