MHCXGraph.core.subgraphs.extract_surface_subgraph_rsa
- MHCXGraph.core.subgraphs.extract_surface_subgraph_rsa(g: Graph, rsa_threshold: float = 0.2, inverse: bool = False, filter_dataframe: bool = True, recompute_distmat: bool = False, update_coords: bool = True, return_node_list: bool = False, *, treat_water_as_surface: bool = True, unknown_policy: str = 'skip', unknown_value: float | None = None) → Graph | list[str] | None
Select nodes by relative solvent accessibility (RSA).
- Parameters:
g (nx.Graph) – Input graph. Nodes may carry ‘rsa’ in [0, 1].
rsa_threshold (float, default=0.2) – Minimum RSA for a node to be included.
inverse (bool, default=False) – If True, include nodes with RSA < rsa_threshold instead.
filter_dataframe (bool, default=True) – Filter graph-level DataFrames to subgraph nodes.
recompute_distmat (bool, default=False) – Recompute graph['dist_mat'] from pdb_df if available.
update_coords (bool, default=True) – Rebuild graph['coords'] from node attributes.
return_node_list (bool, default=False) – If True, return the resolved node list instead of a subgraph.
treat_water_as_surface (bool, default=True) – If True, nodes whose residue name is typical of water (e.g. HOH/WAT/DOD/TIP3) are treated as RSA=1.0 when 'rsa' is missing.
unknown_policy ({'skip', 'value', 'error'}, default='skip') – Behavior for non-water nodes that are missing 'rsa':
- 'skip': ignore the node (do not include, do not raise);
- 'value': use unknown_value as RSA;
- 'error': raise ProteinGraphConfigurationError.
unknown_value (float, optional) – RSA value to use when unknown_policy='value'.
- Returns:
The subgraph, or the resolved node list if return_node_list is True.
- Return type:
Graph | list[str] | None
- Raises:
ProteinGraphConfigurationError – If unknown_policy='error' and a node lacks 'rsa'.
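
The selection rules documented above can be sketched in plain Python. This is an illustration of the documented semantics only, not the library's implementation: the helper name `select_nodes_by_rsa`, the `residue_name` attribute key, the inclusive comparison (`>=`), and the use of `ValueError` in place of ProteinGraphConfigurationError are all assumptions, and a plain dict stands in for the graph's node attributes.

```python
from typing import Any, Mapping, Optional

# Assumed water residue names, taken from the examples in the parameter docs.
WATER_NAMES = {"HOH", "WAT", "DOD", "TIP3"}


def select_nodes_by_rsa(
    nodes: Mapping[str, Mapping[str, Any]],
    rsa_threshold: float = 0.2,
    inverse: bool = False,
    *,
    treat_water_as_surface: bool = True,
    unknown_policy: str = "skip",
    unknown_value: Optional[float] = None,
) -> list:
    """Hypothetical helper: return the ids of nodes that pass the RSA filter."""
    selected = []
    for node_id, attrs in nodes.items():
        rsa = attrs.get("rsa")
        if rsa is None:
            # "residue_name" is an assumed attribute key for this sketch.
            if treat_water_as_surface and attrs.get("residue_name") in WATER_NAMES:
                rsa = 1.0
            elif unknown_policy == "skip":
                continue
            elif unknown_policy == "value":
                if unknown_value is None:
                    raise ValueError("unknown_policy='value' requires unknown_value")
                rsa = unknown_value
            elif unknown_policy == "error":
                # The real function is documented to raise
                # ProteinGraphConfigurationError; ValueError is a stand-in.
                raise ValueError(f"node {node_id!r} lacks 'rsa'")
            else:
                raise ValueError(f"unsupported unknown_policy: {unknown_policy!r}")
        # Assumption: the threshold itself counts as "surface" (inclusive).
        keep = rsa < rsa_threshold if inverse else rsa >= rsa_threshold
        if keep:
            selected.append(node_id)
    return selected


# Toy node table: one buried residue, one exposed residue,
# one water without 'rsa', and one residue without 'rsa'.
nodes = {
    "A:ALA:1": {"rsa": 0.05, "residue_name": "ALA"},
    "A:LYS:2": {"rsa": 0.60, "residue_name": "LYS"},
    "A:HOH:3": {"residue_name": "HOH"},
    "A:GLY:4": {"residue_name": "GLY"},
}

surface = select_nodes_by_rsa(nodes)               # ['A:LYS:2', 'A:HOH:3']
buried = select_nodes_by_rsa(nodes, inverse=True)  # ['A:ALA:1']
print(surface, buried)
```

With the default 'skip' policy the glycine without 'rsa' appears in neither list, while the water is counted as surface because treat_water_as_surface defaults to True.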