Hi All,
I have hundreds of files in the format shown in the first Code block below, and I need to parse some values out of each file.
For example, take the first vector after placements: "1, -2578.16, 0.777385, 0.004132, 0.0006". The leading "1" is the edge number, and it corresponds to B in the tree: line, because B is followed by {1} (the edge number is the number inside the curly brackets). I want to print each placement vector in tab-delimited format, with the fragment name from n: in the first column and the edge number replaced by the matching letter from the tree: line. The output I want for this sample is shown in the second Code block below.
Please let me know the best way to parse this using awk or sed.
Code:
{
tree: ((A:0.2{0},B:0.09{1}):0.7{2},C:0.5{3}){4};,
placements:
[
{p: [[1, -2578.16, 0.777385, 0.004132, 0.0006], [0, -2580.15, 0.107065, 0.000009, 0.0153]], n: [fragment1]},
{p: [[3, -2576.46, 1.0, 0.003555, 0.000006]], n: [fragment2]}
],
metadata:
{invocation:
pplacer -c tiny.refpkg frags.fasta
},
version: 3,
fields:
[edge_num, likelihood, like_weight_ratio,
distal_length, pendant_length]
}
Code:
fragment1 B -2578.16 0.777385 0.004132 0.0006
fragment1 A -2580.15 0.107065 0.000009 0.0153
fragment2 C -2576.46 1.0 0.003555 0.000006
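To make the mapping concrete, here is a minimal sketch of the logic in Python rather than awk or sed, since the nested brackets are awkward to handle line by line. It assumes the files look exactly like the sample above (unquoted keys, the whole tree on one line with no spaces). The function name parse_placements and the three regular expressions are my own illustration, not part of pplacer. If the real files are quoted JSON, a JSON parser would likely be more robust than these patterns.

```python
import re
import sys

# A leaf entry in the tree, e.g. "B:0.09{1}": label, branch length, edge number.
# Internal nodes such as "):0.7{2}" have no label and are skipped.
LEAF_RE = re.compile(r"([^(),:;{}\s]+):[^(),:;{}\s]*\{(\d+)\}")
# One placement record: "{p: [[...], [...]], n: [name, ...]}".
PLACEMENT_RE = re.compile(r"\{\s*p:\s*\[(.*?)\]\s*,\s*n:\s*\[(.*?)\]\s*\}", re.S)
# One inner vector of a record, e.g. "[1, -2578.16, 0.777385, 0.004132, 0.0006]".
VECTOR_RE = re.compile(r"\[([^\[\]]+)\]")


def parse_placements(text):
    """Return tab-delimited rows: name, leaf label, then the remaining fields."""
    # Normalise a typographic minus sign (U+2212) to an ASCII hyphen-minus.
    text = text.replace("\u2212", "-")

    tree_match = re.search(r"tree:\s*(\S+)", text)
    if tree_match is None:
        raise ValueError("no tree: line found")
    # Map edge number -> leaf label, e.g. {"0": "A", "1": "B", "3": "C"}.
    edge_to_label = {num: label for label, num in LEAF_RE.findall(tree_match.group(1))}

    rows = []
    for vectors, names in PLACEMENT_RE.findall(text):
        for name in (n.strip() for n in names.split(",")):
            for vector in VECTOR_RE.findall(vectors):
                fields = [f.strip() for f in vector.split(",")]
                # Fall back to the raw edge number if the edge is not a labelled leaf.
                label = edge_to_label.get(fields[0], fields[0])
                rows.append("\t".join([name, label] + fields[1:]))
    return rows


def main(paths):
    # Print the rows of every file given on the command line.
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for row in parse_placements(handle.read()):
                print(row)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a hypothetical name such as parse_placements.py, it could be run over all the files at once by passing them as arguments, with the combined rows written to standard output. A placement on an internal edge (2 or 4 in the sample tree) has no letter to map to, so the sketch keeps the edge number in that column.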