bird_pipeline_registry / SRP-pipeline
Commit 9d604c6c
Authored Jun 21, 2021 by Eric CHARPENTIER
updated make_ref to set gene symbol instead of ensg in sym2ref when using ensembl
Parent: f5e1eb50
SCRIPTS/make_ref.py
@@ -278,12 +278,21 @@ def processEnsembl(fastaString, fastaOut, annotOut):
 # Split track name by ' '. Transcript ID is on 1st field, gene symbol is on 7th field.
 ls = line.decode("utf-8").split(" ")
 fastaMod += ls[0] + "\n"
-geneField = 1
+geneSymbol = 1
+enst = 1
+found = False
 for i in range(1, len(ls)):
     if (ls[i].startswith("gene:")):
-        geneField = i
+        enst = i
+        continue
+    if (ls[i].startswith("gene_symbol:")):
+        geneSymbol = i
+        found = True
         break
-geneName = ls[geneField].split(":")[1].rstrip('\n')
+if (not found):
+    geneName = ls[enst].split(":")[1].rstrip('\n')
+else:
+    geneName = ls[geneSymbol].split(":")[1].rstrip('\n')
 if (not geneName in gene2transcripts):
     gene2transcripts[geneName] = set()
 gene2transcripts[geneName].add(ls[0][1:])
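
To make the effect of this change easier to follow, here is a minimal standalone sketch of the selection rule it introduces: prefer the `gene_symbol:` field of an Ensembl FASTA header, and fall back to the `gene:` (ENSG) field when no symbol is present. The function name `pick_gene_name`, the sample headers, and their IDs are hypothetical and not part of `make_ref.py`. Unlike the committed code, which falls back to field index 1 when neither field is found, this sketch raises an error in that case.

```python
def pick_gene_name(header: str) -> str:
    """Return the gene symbol from an Ensembl-style FASTA header,
    or the Ensembl gene ID if the header has no gene_symbol: field."""
    fields = header.rstrip("\n").split(" ")
    gene_id = None
    for field in fields[1:]:
        if field.startswith("gene_symbol:"):
            # A symbol always wins, so we can stop scanning here.
            return field.split(":", 1)[1]
        if field.startswith("gene:"):
            # Remember the ENSG identifier as a fallback.
            gene_id = field.split(":", 1)[1]
    if gene_id is None:
        raise ValueError("header has neither gene: nor gene_symbol: field")
    return gene_id


# Hypothetical headers, for illustration only.
headers = [
    ">ENST_A.1 cdna gene:ENSG_X.1 gene_biotype:protein_coding gene_symbol:ABC1 description:example",
    ">ENST_B.1 cdna gene:ENSG_X.1 gene_biotype:protein_coding gene_symbol:ABC1",
    ">ENST_C.1 cdna gene:ENSG_Y.1 gene_biotype:lncRNA",
]

# Group transcript IDs (header's first field, minus the leading '>') by gene name.
gene2transcripts = {}
for h in headers:
    transcript_id = h.split(" ")[0][1:]
    gene2transcripts.setdefault(pick_gene_name(h), set()).add(transcript_id)
```

With these sample headers, the first two transcripts are grouped under the symbol `ABC1`, while the third, which has no symbol, is keyed by its gene ID `ENSG_Y.1`.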