cathpy.util

from cathpy import util

General utility classes and functions

class cathpy.util.AlignmentSummary(*, path, dops, aln_length, seq_count, gap_count, total_positions)

Stores summary information about an alignment.
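
The bare `*` at the start of this signature (and of most signatures in this module) makes every parameter that follows it keyword-only. The sketch below illustrates that calling convention with a hypothetical stand-in class, `AlignmentSummarySketch`. It is not cathpy's implementation, and the example values are invented.

```python
class AlignmentSummarySketch:
    """Hypothetical stand-in that mirrors the documented keyword-only signature."""

    def __init__(self, *, path, dops, aln_length, seq_count, gap_count, total_positions):
        self.path = path
        self.dops = dops
        self.aln_length = aln_length
        self.seq_count = seq_count
        self.gap_count = gap_count
        self.total_positions = total_positions


# Every argument has to be passed by name (values are invented for illustration).
summary = AlignmentSummarySketch(
    path='example.sto', dops=87.5, aln_length=120,
    seq_count=10, gap_count=60, total_positions=1200)

# Passing the same values positionally is rejected by Python itself.
try:
    AlignmentSummarySketch('example.sto', 87.5, 120, 10, 60, 1200)
    positional_ok = True
except TypeError:
    positional_ok = False
```

Keyword-only parameters keep call sites readable when a constructor takes several values of the same type, as here with the four integer counts.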

class cathpy.util.AlignmentSummaryRunner(*, aln_dir=None, aln_file=None, suffix='.sto', skipempty=False)

Provides a summary report for sequence alignment files.

Parameters:
  • aln_dir – input alignment directory
  • aln_file – input alignment file
  • suffix – filter alignments by suffix
  • skipempty – skip empty files
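
The `suffix` and `skipempty` options describe a file-selection step. The sketch below shows one way that step could work, using only the standard library. The helper `find_alignment_files` is hypothetical and is not part of cathpy; treating "empty" as "zero bytes on disk" is also an assumption.

```python
from pathlib import Path


def find_alignment_files(aln_dir, *, suffix='.sto', skipempty=False):
    """Return the sorted files in aln_dir whose names end with suffix.

    Hypothetical helper: 'empty' is assumed to mean a zero-byte file.
    """
    selected = []
    for path in sorted(Path(aln_dir).iterdir()):
        # Ignore subdirectories and files with a different suffix.
        if not path.is_file() or not path.name.endswith(suffix):
            continue
        # Optionally drop zero-byte files.
        if skipempty and path.stat().st_size == 0:
            continue
        selected.append(path)
    return selected
```

Called as `find_alignment_files('alignments', skipempty=True)`, this would return only the non-empty `.sto` files in a directory named `alignments` (the directory name is illustrative).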

class cathpy.util.FunfamFileFinder(base_dir, *, ff_tmpl='__SFAM__-ff-__FF_NUM__.sto')

Finds a FunFam alignment file within a directory.

funfam_id_from_file(ff_file)

Extracts a FunfamID from the file name (based on the ff_tmpl)
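
As a rough illustration of template-based extraction, the sketch below turns the default `ff_tmpl` value into a regular expression and pulls the two placeholder values out of a file name. The helper names, the character classes chosen for each placeholder, and the `(superfamily, number)` tuple returned are all assumptions. The real method returns a FunfamID and may work differently.

```python
import re
from pathlib import Path


def template_to_regex(ff_tmpl='__SFAM__-ff-__FF_NUM__.sto'):
    """Compile a file-name template into a regex with named groups."""
    # Escape literal characters first; underscores are left untouched by
    # re.escape, so the placeholders survive and can be swapped afterwards.
    pattern = re.escape(ff_tmpl)
    pattern = pattern.replace('__SFAM__', '(?P<sfam>[0-9.]+)')
    pattern = pattern.replace('__FF_NUM__', '(?P<ff_num>[0-9]+)')
    return re.compile(pattern)


def funfam_id_from_name(ff_file, ff_tmpl='__SFAM__-ff-__FF_NUM__.sto'):
    """Return (superfamily id, funfam number) parsed from the file name."""
    name = Path(ff_file).name
    match = template_to_regex(ff_tmpl).fullmatch(name)
    if match is None:
        raise ValueError('file name {!r} does not match template {!r}'.format(name, ff_tmpl))
    return match.group('sfam'), int(match.group('ff_num'))


# Illustrative file name following the default template.
sfam_id, ff_num = funfam_id_from_name('1.10.8.10-ff-1234.sto')  # ('1.10.8.10', 1234)
```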

search_by_domain_id(domain_id)

Returns the file name of the FunFam alignment containing the domain id.

class cathpy.util.GroupsimResult(*, scores=None)

Represents the result from running the groupsim algorithm.

count_positions

Returns the number of positions in the groupsim result.

classmethod new_from_file(gs_file)

Create a new groupsim result from an output file.

classmethod new_from_io(gs_io, *, maxscore=1)

Create a new groupsim result from an io source.

class cathpy.util.GroupsimRunner(*, groupsim_dir='<cathpy package directory>/tools/GroupSim', python2path='python2', column_gap=0.3, group_gap=0.5)

Object that provides a wrapper around groupsim.

run_alignment(alignment, *, column_gap=None, group_gap=None, mclachlan=False)

Runs groupsim against a given alignment.

class cathpy.util.ScoreconsResult(*, dops, scores)

Represents the results from running scorecons.

to_string

Returns the scorecons results as a string (one char per position).
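
To make "one char per position" concrete, the sketch below encodes a list of scores as a string of digits. The encoding itself is an assumption made for illustration: scores are taken to lie between 0 and 1, and each is reduced to a single digit from 0 to 9. The helper `scores_to_string` is hypothetical, and the library's own mapping may differ.

```python
def scores_to_string(scores):
    """Encode each score as one digit (assumed mapping, not cathpy's own)."""
    chars = []
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError('score out of range [0, 1]: {!r}'.format(score))
        # Scale to 0..10, truncate, and cap at 9 so 1.0 still fits in one char.
        chars.append(str(min(int(score * 10), 9)))
    return ''.join(chars)


encoded = scores_to_string([0.0, 0.35, 0.99, 1.0])  # '0399'
```

A string of this kind has the same length as the alignment, so it can be stored next to the sequences as a per-column annotation line.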

class cathpy.util.ScoreconsRunner(*, scorecons_path='<cathpy package directory>/tools/linux-x86_64/scorecons', matrix_path='<cathpy package directory>/tools/data/PET91mod.mat2')

Runs scorecons for a given alignment.

run_alignment(alignment)

Runs scorecons on a given alignment.

run_fasta(fasta_file)

Returns scorecons data (ScoreconsResult) for the provided FASTA file.

Returns: scorecons result
Return type: ScoreconsResult

run_stockholm(sto_file)

Returns scorecons data for the provided STOCKHOLM file.

Returns: scorecons result
Return type: ScoreconsResult

class cathpy.util.StructuralClusterMerger(*, cath_version, sc_file, ff_dir, out_fasta=None, out_sto=None, ff_tmpl='__SFAM__-ff-__FF_NUM__.sto', add_groupsim=True, add_scorecons=True, cath_release=None)

Merges FunFams based on a structure-based alignment of representative sequences.

Parameters:
  • cath_version – version of CATH
  • sc_file – structure-based alignment (*.fa) of FunFam representatives
  • ff_dir – directory containing the FunFam alignments (*.sto) to merge
  • out_fasta – file to write merged alignment (FASTA)
  • out_sto – file to write merged alignment (STOCKHOLM)
  • ff_tmpl – template used to find the FunFam alignment files
  • add_groupsim – add groupsim data (default: True)
  • add_scorecons – add scorecons data (default: True)
  • cath_release – specify custom release data directory