Python regex, how to capture multiple rules from 1 string


I have a quick question about regex. I have a file (testlog-date.log) that has lines like this:

# 2014-04-09 16:43:15,136|pid: 1371|info|test.controller.root|finished processing request        in   0.003355s https://website/heartbeat 

I'm looking to use regex to capture the pid and the time. So far I have this:

import re

file_handler = open("testlog-20140409.log", "r")
for line in file_handler:
    var1 = re.findall(r'(\d+.\d+)s', line)
    print(var1)
file_handler.close()

So I'm able to print the process time. The question is how to capture the pid (and possibly other information) in the variable var1. I tried doing:

var1 = re.findall(r'pid: (\d+) (\d+.\d+)s', line)  

It prints out empty lists.

Any help is much appreciated, thanks!

Follow-up: the file is quite large. I'm thinking of storing the data in one structure, sorting it by process time, and printing out the top 20. Any idea how to do this properly?

Use the regex (.*)\|(pid: .*)\|(.*)\|(.*)\|(.*). Each pair of parentheses in the regex pattern denotes a separate group.

In [125]: text = '2014-04-09 16:43:15,136|pid: 1371|info|test.controller.root|finished processing request        in   0.003355s https://website/heartbeat'

In [126]: pattern = re.compile(r'(.*)\|(pid: .*)\|(.*)\|(.*)\|(.*)')

In [127]: results = re.findall(pattern, text)

In [128]: results
Out[128]:
[('2014-04-09 16:43:15,136',
  'pid: 1371',
  'info',
  'test.controller.root',
  'finished processing request        in   0.003355s https://website/heartbeat')]

So you have a list containing one tuple, with each element belonging to one of the groups (timestamp, pid, log level, routine, and log message).
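If only the pid and the processing time are needed, a narrower pattern can pull out both in one pass. The attempt in the question returns nothing because 'pid: (\d+) (\d+.\d+)s' requires exactly one space between the pid and the time, while the real line has several fields in between. Below is a minimal sketch, assuming every line of interest has "pid: N" before the "in X.XXXs" part. The names LINE_RE and parse_line are illustrative and not from the original post.

```python
import re

# Hypothetical names. The pattern assumes "pid: <digits>|" appears before
# "in <seconds>s" on the same line.
LINE_RE = re.compile(r'pid: (?P<pid>\d+)\|.*?\bin\s+(?P<seconds>\d+\.\d+)s')


def parse_line(line):
    """Return (pid, seconds) for a matching log line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    # Convert the captured strings so they can be compared numerically.
    return int(match.group('pid')), float(match.group('seconds'))


sample = ('2014-04-09 16:43:15,136|pid: 1371|info|test.controller.root|'
          'finished processing request        in   0.003355s '
          'https://website/heartbeat')
print(parse_line(sample))
```

Converting to int and float up front makes later sorting by time straightforward, since string comparison would order the values lexically.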

Edit

For large files, regex can be time consuming. Your log lines have a '|' delimiter, so you can split each line on it instead.

all_lines = []
for line in file:
    all_lines.append(line.split('|'))

This stores the data as a list of lists:

[['2014-04-09 16:43:15,136','pid: 1371','info','test.controller.root','finished processing request        in   0.003355s https://website/heartbeat'], ..., ...] 
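One detail worth watching with split: each line read from a file still ends with a newline, and a message that itself contains '|' would be broken into extra fields. Here is a small sketch, assuming the five-field layout shown above (split_log_line is a hypothetical helper name):

```python
def split_log_line(line):
    """Split one log line into at most five '|'-separated fields."""
    # rstrip drops the trailing newline; maxsplit=4 keeps any '|' that
    # appears inside the message in the final field.
    return line.rstrip('\n').split('|', 4)


fields = split_log_line(
    '2014-04-09 16:43:15,136|pid: 1371|info|test.controller.root|'
    'finished processing request        in   0.003355s '
    'https://website/heartbeat\n')
print(fields[1])
```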

To sort all_lines, you can use the sorted() function and pass the first field of each sub-list as the sort key.

sorted_lines = sorted(all_lines, key=lambda x: x[0]) 
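Note that x[0] is the timestamp, so this call orders the lines chronologically. For the follow-up about printing the 20 slowest requests, one possible approach is to extract the processing time from the message field and let heapq.nlargest keep only the top entries, which avoids sorting the whole file. This is a sketch, assuming the time always appears as "in <seconds>s" in the last field. The names TIME_RE and top_slowest are illustrative, and the demo lines are made-up sample data.

```python
import heapq
import re

# Assumed format of the duration inside the message field.
TIME_RE = re.compile(r'\bin\s+(\d+\.\d+)s')


def top_slowest(lines, n=20):
    """Return the n (seconds, fields) pairs with the largest durations."""
    def timed():
        for line in lines:
            fields = line.rstrip('\n').split('|', 4)
            match = TIME_RE.search(fields[-1])
            if match:  # skip lines that carry no duration
                yield float(match.group(1)), fields
    # nlargest holds only n candidates at a time, slowest first in the result.
    return heapq.nlargest(n, timed(), key=lambda item: item[0])


# Made-up sample lines in the same layout as the log in the question.
demo = [
    '2014-04-09 16:43:15,136|pid: 1|info|test.controller.root|finished in 0.25s\n',
    '2014-04-09 16:43:16,000|pid: 2|info|test.controller.root|finished in 0.5s\n',
    '2014-04-09 16:43:17,000|pid: 3|info|test.controller.root|started\n',
    '2014-04-09 16:43:18,000|pid: 4|info|test.controller.root|finished in 0.125s\n',
]
for seconds, fields in top_slowest(demo, n=2):
    print(seconds, fields[1])
```

For the real log, an open file object can be passed in place of demo, since the function only iterates over lines once.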
