MHCXGraph.core.subgraphs.extract_surface_subgraph_rsa

MHCXGraph.core.subgraphs.extract_surface_subgraph_rsa(g: Graph, rsa_threshold: float = 0.2, inverse: bool = False, filter_dataframe: bool = True, recompute_distmat: bool = False, update_coords: bool = True, return_node_list: bool = False, *, treat_water_as_surface: bool = True, unknown_policy: str = 'skip', unknown_value: float | None = None) → Graph | list[str] | None

Select nodes by relative solvent accessibility (RSA).

Parameters:
  • g (nx.Graph) – Input graph. Nodes may carry ‘rsa’ in [0, 1].

  • rsa_threshold (float, default=0.2) – Minimum RSA a node must have to be included.

  • inverse (bool, default=False) – If True, include nodes with RSA < rsa_threshold instead.

  • filter_dataframe (bool, default=True) – Filter graph-level DataFrames to subgraph nodes.

  • recompute_distmat (bool, default=False) – Recompute graph[‘dist_mat’] from pdb_df if available.

  • update_coords (bool, default=True) – Rebuild graph[‘coords’] from node attributes.

  • return_node_list (bool, default=False) – If True, return the resolved node list instead of a subgraph.

  • treat_water_as_surface (bool, default=True) – If True, nodes with residue name typical of water (e.g. HOH/WAT/DOD/TIP3) are treated as RSA=1.0 when ‘rsa’ is missing.

  • unknown_policy ({'skip', 'value', 'error'}, default='skip') – Behavior for nodes missing ‘rsa’ that are not water:

      - ‘skip’: ignore the node (do not include, do not raise);
      - ‘value’: use unknown_value as RSA;
      - ‘error’: raise ProteinGraphConfigurationError.

  • unknown_value (float, optional) – RSA value to use when unknown_policy='value'.

Returns:

The subgraph of selected nodes, or the list of selected node identifiers if return_node_list is True.

Return type:

nx.Graph or list of str

Raises:

ProteinGraphConfigurationError – If unknown_policy='error' and a node lacks ‘rsa’.
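
Example:

The selection rules described under Parameters can be sketched in plain Python. This is a minimal illustration, not the library's implementation. The helper select_nodes_by_rsa, the ‘residue_name’ attribute key, the node identifiers, and the use of ValueError in place of ProteinGraphConfigurationError are all assumptions made for the example. So is the treatment of a node whose RSA equals the threshold exactly, which is counted as included here.

```python
from typing import Optional

# Residue names the documentation gives as typical of water.
WATER_NAMES = {"HOH", "WAT", "DOD", "TIP3"}


def select_nodes_by_rsa(
    nodes: dict,
    rsa_threshold: float = 0.2,
    inverse: bool = False,
    *,
    treat_water_as_surface: bool = True,
    unknown_policy: str = "skip",
    unknown_value: Optional[float] = None,
) -> list:
    """Hypothetical restatement of the documented RSA selection rules.

    `nodes` maps a node identifier to its attribute dict.
    """
    if unknown_policy not in {"skip", "value", "error"}:
        raise ValueError(f"unsupported unknown_policy: {unknown_policy!r}")

    selected = []
    for node_id, attrs in nodes.items():
        rsa = attrs.get("rsa")
        if rsa is None:
            # Assumption: the residue name lives under 'residue_name'.
            is_water = attrs.get("residue_name") in WATER_NAMES
            if treat_water_as_surface and is_water:
                rsa = 1.0  # water without 'rsa' counts as fully exposed
            elif unknown_policy == "skip":
                continue  # neither included nor reported
            elif unknown_policy == "value":
                if unknown_value is None:
                    raise ValueError("unknown_value is required for policy 'value'")
                rsa = unknown_value
            else:
                # Stand-in for ProteinGraphConfigurationError.
                raise ValueError(f"node {node_id!r} lacks 'rsa'")

        # Assumption: a node exactly at the threshold is included.
        keep = rsa < rsa_threshold if inverse else rsa >= rsa_threshold
        if keep:
            selected.append(node_id)
    return selected


# Hypothetical node identifiers and attributes.
nodes = {
    "A:ALA:1": {"rsa": 0.55, "residue_name": "ALA"},
    "A:LEU:2": {"rsa": 0.05, "residue_name": "LEU"},
    "A:HOH:3": {"residue_name": "HOH"},  # water, no 'rsa'
    "A:GLY:4": {"residue_name": "GLY"},  # non-water, no 'rsa'
}

surface = select_nodes_by_rsa(nodes)
buried = select_nodes_by_rsa(nodes, inverse=True)
buried_with_default = select_nodes_by_rsa(
    nodes, inverse=True, unknown_policy="value", unknown_value=0.0
)
```

With the default policy the glycine node is dropped from both selections because it has no ‘rsa’. Passing unknown_policy="value" with unknown_value=0.0 places it among the low-RSA nodes, and unknown_policy="error" would raise instead.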