MHCXGraph.utils.tools
- MHCXGraph.utils.tools.association_product(graphs_data: list, config: dict) -> dict[str, list] | None
Compute the cross-protein association product.
This function orchestrates the full multi-protein association pipeline including triad detection, chunked combination, graph construction, and frame generation.
- MHCXGraph.utils.tools.build_graph_from_cross_combos(cross_combos) -> set[tuple[tuple[str, ...], tuple[str, ...]]]
Build graph edges from cross-protein triad combinations.
- MHCXGraph.utils.tools.build_threshold_vector(nodes, maps, threshold_cfg)
Return the upper-triangular threshold vector instead of a full KxK matrix.
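The packed layout can be illustrated with a small sketch that keeps only the K(K-1)/2 entries above the diagonal. The helper names `pack_upper` and `packed_index`, the exclusion of the diagonal, and the row-major ordering are assumptions made for illustration; the layout actually used by `build_threshold_vector` is not documented here.

```python
def packed_index(i: int, j: int, k: int) -> int:
    """Position of entry (i, j), i != j, in a row-major strict upper-triangular vector."""
    if i == j:
        raise ValueError("diagonal entries are not stored in this layout")
    if i > j:
        i, j = j, i  # symmetric matrix: (j, i) shares the slot of (i, j)
    # Rows 0..i-1 contribute (k-1) + (k-2) + ... entries before row i starts.
    return i * (k - 1) - i * (i - 1) // 2 + (j - i - 1)


def pack_upper(matrix):
    """Flatten the strict upper triangle of a KxK nested list (hypothetical helper)."""
    k = len(matrix)
    return [matrix[i][j] for i in range(k) for j in range(i + 1, k)]
```

For K nodes this stores K(K-1)/2 values instead of K², which is the saving the summary line refers to.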
- MHCXGraph.utils.tools.convert_edges_to_residues(edges: set[frozenset], maps: dict) -> tuple[list, list, list]
Convert edge representations from node indices to residue labels.
- Parameters:
- Returns:
original_edges (list) – Original edge objects as provided in the input set.
edges_indices (list[tuple]) – Edge representation using tuples of node indices.
converted_edges (list[tuple]) – Edge representation where node indices are converted to residue labels of the form "CHAIN:RESNAME:RESNUM".
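As a rough illustration of the three return values, the sketch below converts index-based edges to label-based edges. The function name `edges_to_labels` and the flat `label_by_index` dictionary are hypothetical; the real `maps` structure is not described in this reference and is likely richer.

```python
def edges_to_labels(edges, label_by_index):
    """Return (original edges, index tuples, label tuples) for a set of frozenset edges.

    `label_by_index` is an assumed mapping {node_index: "CHAIN:RESNAME:RESNUM"}.
    """
    original = list(edges)
    # Sort each edge so the tuple form is deterministic.
    index_edges = [tuple(sorted(edge)) for edge in original]
    labelled = [tuple(label_by_index[i] for i in pair) for pair in index_edges]
    return original, index_edges, labelled


labels = {0: "A:GLY:12", 1: "A:TYR:59", 2: "C:LEU:3"}
edges = {frozenset({0, 1}), frozenset({1, 2})}
original, index_edges, labelled = edges_to_labels(edges, labels)
```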
- MHCXGraph.utils.tools.create_coherent_matrices(nodes, matrices: dict, maps: dict, threshold: float | dict = 3.0)
Compute coherence matrices across proteins using a memory-efficient streaming approach.
- Parameters:
- Returns:
new_matrices (dict) – Dictionary containing coherence masks and standard deviation matrices.
maps (dict) – Updated node mapping dictionary.
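One way to obtain a standard deviation matrix and a coherence mask without holding every protein's matrix in memory is a running (Welford) update. The sketch below is an assumption about what "streaming" could mean here, not the library's implementation: the function name `streaming_coherence`, the nested-list inputs, the population standard deviation, and the rule "coherent if std <= threshold" are all illustrative choices.

```python
import math


def streaming_coherence(distance_matrices, threshold=3.0):
    """Consume KxK distance matrices one at a time; return (mask, std)."""
    count = 0
    mean = m2 = None
    for mat in distance_matrices:
        k = len(mat)
        if mean is None:
            mean = [[0.0] * k for _ in range(k)]
            m2 = [[0.0] * k for _ in range(k)]
        count += 1
        for i in range(k):
            for j in range(k):
                # Welford update: only the running mean and squared deviations are kept.
                delta = mat[i][j] - mean[i][j]
                mean[i][j] += delta / count
                m2[i][j] += delta * (mat[i][j] - mean[i][j])
    if count == 0:
        raise ValueError("at least one matrix is required")
    std = [[math.sqrt(v / count) for v in row] for row in m2]
    mask = [[s <= threshold for s in row] for row in std]
    return mask, std
```

Because the input is consumed lazily, a generator that loads one protein at a time can be passed in, so peak memory stays at two KxK accumulators.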
- MHCXGraph.utils.tools.create_graph(edges_dict: dict, typeEdge: str = 'edges_indices', comp_id=0, *, edge_std_matrix: ndarray | None = None, node_index_map: dict[Any, int] | None = None)
Construct NetworkX graphs from frame edge definitions.
- Parameters:
edges_dict (dict) – Frame dictionary containing edge definitions.
typeEdge (str, default="edges_indices") – Key specifying which edge representation to use.
comp_id (int, default=0) – Component identifier used for logging.
edge_std_matrix (ndarray, optional) – Matrix of edge standard deviations used for visualization.
node_index_map (dict, optional) – Mapping between node identifiers and matrix indices.
- Returns:
graphs – List of constructed NetworkX graphs.
- Return type:
list[networkx.Graph]
- MHCXGraph.utils.tools.cross_protein_triads(step_idx, chunk_idx, triads_per_protein, diff, check_distances=True)
Generate cross-protein combinations of compatible triads.
- Parameters:
step_idx (int) – Current hierarchical association step.
chunk_idx (int) – Index of the chunk being processed.
triads_per_protein (list[dict]) – List of triad dictionaries for each protein.
diff (float) – Maximum allowed distance difference across proteins.
check_distances (bool, default=True) – If True, distance bounds are used to filter candidate triad combinations.
- Returns:
cross – Dictionary describing cross-protein triad combinations.
- Return type:
dict
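The filtering idea behind `diff` and `check_distances` can be sketched as follows. The data layout is hypothetical: each protein is represented as `{token: [(residues, distances), ...]}`, and a combination is kept only if each of the three triad distances varies by at most `diff` across proteins. The real triad dictionaries and compatibility rules may differ.

```python
from itertools import product


def cross_combos(triads_per_protein, diff, check_distances=True):
    """Combine triads that share a token across all proteins (illustrative only)."""
    shared = set(triads_per_protein[0])
    for triads in triads_per_protein[1:]:
        shared &= set(triads)

    cross = {}
    for token in sorted(shared):
        kept = []
        for combo in product(*(triads[token] for triads in triads_per_protein)):
            if check_distances:
                # Compare the first, second and third distance across proteins.
                per_edge = zip(*(instance[1] for instance in combo))
                if any(max(d) - min(d) > diff for d in per_edge):
                    continue
            kept.append(tuple(instance[0] for instance in combo))
        if kept:
            cross[token] = kept
    return cross


p1 = {"GLY-TYR-LEU": [(("A:1", "A:2", "A:3"), (5.0, 6.0, 7.0))]}
p2 = {
    "GLY-TYR-LEU": [
        (("B:1", "B:2", "B:3"), (5.5, 6.2, 7.1)),
        (("B:4", "B:5", "B:6"), (9.0, 6.0, 7.0)),
    ],
    "ALA-ALA-ALA": [(("B:7", "B:8", "B:9"), (4.0, 4.0, 4.0))],
}
result = cross_combos([p1, p2], diff=1.0)
```

With `check_distances=False` every same-token combination is kept, which shows why the distance filter matters for limiting the combinatorial growth.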
- MHCXGraph.utils.tools.execute_step(step_idx: int, graph_collection, max_chunks: int, current_filtered_cross_combos, graphs_data, global_state, residue_tracker)
Execute a single hierarchical association step.
- Parameters:
step_idx (int) – Current step index.
graph_collection (dict) – Graph collection produced during preprocessing.
max_chunks (int) – Maximum chunk size used for hierarchical grouping.
current_filtered_cross_combos (list) – Cross-combo results from the previous step.
graphs_data (list) – Graph metadata structures.
global_state (dict) – Shared global state containing matrices, maps, and configuration parameters.
residue_tracker (ResidueTracker, optional) – Tracking object used for debugging and provenance logging.
- Returns:
filtered_cross_combos (list) – Filtered cross combinations for the next step.
step_graphs (list) – Graphs produced during the step.
- MHCXGraph.utils.tools.filter_maps_by_nodes(data: dict, matrices_dict: dict, distance_threshold: float = 10.0) -> tuple[dict, dict]
Filter contact and RSA maps according to graph nodes.
- Parameters:
- Returns:
matrices_dict (dict) – Updated dictionary containing pruned and thresholded matrices.
maps (dict) – Mapping structure describing residue indices and filtered residue maps.
- MHCXGraph.utils.tools.find_class(classes: dict[str, dict[str, float]], value: float)
Find class intervals that contain a numeric value.
- Parameters:
- Returns:
class_name – Name of the matching class, a list of classes if multiple intervals match, or None if no interval contains the value.
- Return type:
str | list[str] | None
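The three possible return shapes (a single name, a list of names, or None) can be shown with a minimal sketch. The inner keys `"low"` and `"high"` and the inclusive bounds are assumptions; the reference only states that `classes` maps names to dictionaries of floats.

```python
def find_class_sketch(classes, value):
    """Return the class whose interval contains `value` (hypothetical key names)."""
    matches = [
        name
        for name, bounds in classes.items()
        if bounds["low"] <= value <= bounds["high"]
    ]
    if not matches:
        return None
    # A value on a shared boundary belongs to more than one interval.
    return matches[0] if len(matches) == 1 else matches


rsa_classes = {
    "buried": {"low": 0.0, "high": 0.2},
    "intermediate": {"low": 0.2, "high": 0.5},
    "exposed": {"low": 0.5, "high": 1.0},
}
```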
- MHCXGraph.utils.tools.find_triads(graph_data, classes, config, checks, protein_index, tracker: ResidueTracker | None = None)
Identify residue triads within a protein interaction graph.
- Parameters:
graph_data (dict) – Graph metadata containing the graph object, contact map, RSA values, and residue mappings.
classes (dict) – Classification dictionaries defining bins for residues, distances, or solvent accessibility.
config (dict) – Association configuration controlling thresholds, discretization, and filtering rules.
checks (dict) – Dictionary controlling optional filters such as RSA checks.
protein_index (int) – Index of the protein currently being processed.
tracker (ResidueTracker, optional) – Tracking object used for debugging and provenance recording of triad generation.
- Returns:
triads – Dictionary mapping triad tokens to metadata including counts and absolute triad instances.
- Return type:
dict
- MHCXGraph.utils.tools.generate_frames(component_graph, matrices, maps, len_component, chunk_id, step, config, debug=False, debug_every=5000, nodes=None, steps_end=False, residue_tracker: ResidueTracker | None = None)
Generate coherent structural frames from a component graph.
Frames correspond to coherent subgraphs satisfying distance and adjacency constraints.
- Parameters:
component_graph (networkx.Graph) – Graph component under analysis.
matrices (dict) – Coherence matrices and adjacency matrices.
maps (dict) – Residue mapping dictionary.
len_component (int) – Number of nodes in the component.
chunk_id (int) – Chunk identifier.
step (int) – Association step index.
config (dict) – Association configuration.
debug (bool, default=False) – Enable debug logging.
debug_every (int, default=5000) – Interval for progress logging during search.
nodes (list, optional) – Node ordering corresponding to matrix indices.
steps_end (bool, default=False) – If True, perform final frame filtering.
residue_tracker (ResidueTracker, optional) – Tracking object used for recording accepted frames.
- Returns:
frames (dict) – Dictionary describing generated frames.
union_graph (dict) – Graph representation combining all accepted frames.
- MHCXGraph.utils.tools.get_memory_usage_mb()
Return the RSS memory usage in MB if psutil is available; otherwise return None.
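A function with this behaviour is commonly written with an optional import. The sketch below is an assumption about the approach, not the library's source; the name `memory_usage_mb` is hypothetical and "MB" is taken to mean bytes divided by 1024².

```python
def memory_usage_mb():
    """Resident set size of the current process in MB, or None without psutil."""
    try:
        import psutil  # optional dependency
    except ImportError:
        return None
    rss_bytes = psutil.Process().memory_info().rss
    return rss_bytes / (1024 ** 2)


usage = memory_usage_mb()
```

Callers should treat None as "measurement unavailable" rather than as zero usage.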
- MHCXGraph.utils.tools.process_chunk(step_idx, chunk_idx, chunk_triads, global_state, residue_tracker)
Process a chunk of triads during hierarchical association.
- Parameters:
step_idx (int) – Current association step.
chunk_idx (int) – Index of the chunk being processed.
chunk_triads (list) – Triad groups contained within the chunk.
global_state (dict) – Global state containing matrices, maps, and configuration.
residue_tracker (ResidueTracker, optional) – Tracking object used for recording intermediate states.
- Returns:
rebuilt_combos (dict or list or None) – Reconstructed combinations used in the next step.
final_graphs (list) – Graphs generated from the processed chunk.
- MHCXGraph.utils.tools.rebuild_cross_combos(cross_combos: dict[dict, list[tuple[tuple, ...]]], graph_nodes)
Reconstruct cross-combo structures after graph pruning.
- MHCXGraph.utils.tools.sym_from_packed_float(k: int, packed: ndarray, fill_diag: float = nan) -> ndarray
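No description accompanies this entry. Judging only from the signature, it plausibly expands a packed upper-triangular vector into a symmetric k x k matrix with `fill_diag` on the diagonal. The sketch below illustrates that reading with plain lists; the row-major ordering, the excluded diagonal, and the name `sym_from_packed` are assumptions.

```python
import math


def sym_from_packed(k, packed, fill_diag=math.nan):
    """Expand a strict upper-triangular vector into a symmetric KxK nested list."""
    expected = k * (k - 1) // 2
    if len(packed) != expected:
        raise ValueError(f"expected {expected} packed values, got {len(packed)}")
    out = [[fill_diag] * k for _ in range(k)]
    pos = 0
    for i in range(k):
        for j in range(i + 1, k):
            # Mirror each stored value across the diagonal.
            out[i][j] = out[j][i] = packed[pos]
            pos += 1
    return out
```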
- MHCXGraph.utils.tools.triad_chirality_with_cb(ca_a: ndarray, ca_b: ndarray, ca_c: ndarray, cb_a: ndarray, cb_b: ndarray, cb_c: ndarray, *, weights: tuple[float, float, float] | None = None, outward_normal: ndarray | None = None, majority_only: bool = True) -> dict[str, Any]
Compute the chirality of a residue triad using Cα and Cβ atoms.
The method estimates a pose-invariant but mirror-sensitive chirality signature based on side-chain orientation relative to the triangle defined by three Cα atoms.
- Parameters:
ca_a (ndarray of shape (3,)) – Cartesian coordinates of Cα atoms.
ca_b (ndarray of shape (3,)) – Cartesian coordinates of Cα atoms.
ca_c (ndarray of shape (3,)) – Cartesian coordinates of Cα atoms.
cb_a (ndarray of shape (3,)) – Cartesian coordinates of Cβ atoms.
cb_b (ndarray of shape (3,)) – Cartesian coordinates of Cβ atoms.
cb_c (ndarray of shape (3,)) – Cartesian coordinates of Cβ atoms.
weights (tuple[float, float, float], optional) – Optional per-residue weights applied when averaging side-chain direction vectors.
outward_normal (ndarray of shape (3,), optional) – Reference outward direction used to orient side-chain vectors.
majority_only (bool, default=True) – If True, only side chains consistent with the majority orientation relative to the triangle normal contribute to the final direction.
- Returns:
result – Dictionary containing chirality information, including handedness bit, score, side-chain consistency, and intermediate geometric vectors.
- Return type:
dict[str, Any]
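The core geometric idea, a signature that survives translation and rotation but flips under reflection, can be sketched with a much simpler computation than the documented function performs. The version below ignores `weights`, `outward_normal`, and `majority_only`, and the helper names are hypothetical: it takes the normal of the Cα triangle and checks on which side the mean Cα-to-Cβ direction points.

```python
def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def triad_handedness(ca, cb):
    """Return +1, -1 or 0 for three Calpha and three Cbeta (x, y, z) tuples."""
    # Normal of the Calpha triangle; its sign depends on the residue ordering.
    normal = _cross(_sub(ca[1], ca[0]), _sub(ca[2], ca[0]))
    # Unweighted mean of the three Calpha -> Cbeta vectors.
    side = [_sub(b, a) for a, b in zip(ca, cb)]
    mean_side = tuple(sum(component) / 3 for component in zip(*side))
    score = _dot(normal, mean_side)
    return (score > 0) - (score < 0)
```

Mirroring the coordinates through the triangle plane flips the sign, while translating all six atoms leaves it unchanged.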
- MHCXGraph.utils.tools.value_to_class(value: float, bin_width: float, threshold: float, diff_threshold: float, inverse: bool = False, upper_bound: float = 100.0, close_tolerance: float = 0.1) -> int | list[int] | None
Assign a numeric value to one or more discretized bins.
- Parameters:
value (float) – Numeric value to classify.
bin_width (float) – Width of each bin interval.
threshold (float) – Boundary separating lower and upper classification domains.
inverse (bool, default=False) – If True, classification occurs in the range [threshold, upper_bound].
upper_bound (float, default=100.0) – Maximum allowed value in inverse classification mode.
close_tolerance (float, default=0.1) – Absolute tolerance used to detect values close to bin centers.
- Returns:
classes – Bin index or indices representing the classification result. Returns None if the value lies outside the allowed domain.
- Return type:
int | list[int] | None
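A deliberately reduced sketch of the binning idea follows. It covers only the domain check and a single-bin result; it omits `diff_threshold` and `close_tolerance`, so it never returns a list of bins. The name `simple_bin` and the choice of [0, threshold] as the non-inverse domain are assumptions.

```python
import math


def simple_bin(value, bin_width, threshold, inverse=False, upper_bound=100.0):
    """Map a value to one bin index, or None if it lies outside the domain."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    # Assumed domains: [0, threshold] normally, [threshold, upper_bound] if inverse.
    low, high = (threshold, upper_bound) if inverse else (0.0, threshold)
    if not (low <= value <= high):
        return None
    return math.floor((value - low) / bin_width)
```

For instance, with `bin_width=2.0` and `threshold=10.0`, a value of 7.0 lands in bin 3, while 12.0 is outside the domain and yields None.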
Functions
association_product | Compute the cross-protein association product. |
build_graph_from_cross_combos | Build graph edges from cross-protein triad combinations. |
build_threshold_vector | Return the upper-triangular threshold vector instead of a full KxK matrix. |
convert_edges_to_residues | Convert edge representations from node indices to residue labels. |
create_coherent_matrices | Compute coherence matrices across proteins using a memory-efficient streaming approach. |
create_graph | Construct NetworkX graphs from frame edge definitions. |
cross_protein_triads | Generate cross-protein combinations of compatible triads. |
execute_step | Execute a single hierarchical association step. |
filter_maps_by_nodes | Filter contact and RSA maps according to graph nodes. |
find_class | Find class intervals that contain a numeric value. |
find_triads | Identify residue triads within a protein interaction graph. |
generate_frames | Generate coherent structural frames from a component graph. |
get_memory_usage_mb | Return the RSS memory usage in MB if psutil is available. |
process_chunk | Process a chunk of triads during hierarchical association. |
rebuild_cross_combos | Reconstruct cross-combo structures after graph pruning. |
sym_from_packed_float | |
triad_chirality_with_cb | Compute the chirality of a residue triad using Cα and Cβ atoms. |
value_to_class | Assign a numeric value to one or more discretized bins. |