Bergson Documentation
Bergson is a library for tracing the memory of deep neural nets with gradient-based data attribution techniques.
We provide options for analyzing models and datasets at any scale or level of granularity:
Work with compressed or uncompressed gradients.
Store gradients on disk or process them in memory.
Accumulate queries following LESS and other strategies.
Query small gradient datasets on the GPU, and large ones using a sharded FAISS index.
Collect gradients during or after training.
Parallelize Bergson operations across multiple GPUs or nodes.
Load gradients with or without their module-wise structure.
Split attention module gradients by head.
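As a rough illustration of what compressed gradients and gradient queries mean, the sketch below projects per-example gradients to a lower dimension with a random Gaussian matrix and scores training examples by their inner product with a query gradient. It uses only NumPy and synthetic data. The helper names (`random_projection`, `compress`) are hypothetical, and this is not Bergson's implementation or API.

```python
import numpy as np


def random_projection(dim_in: int, dim_out: int, seed: int = 0) -> np.ndarray:
    """Gaussian projection, scaled so inner products are preserved in expectation."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim_in, dim_out)) / np.sqrt(dim_out)


def compress(grads: np.ndarray, proj: np.ndarray) -> np.ndarray:
    """Project a [n_examples, dim_in] gradient matrix down to [n_examples, dim_out]."""
    return grads @ proj


# Synthetic stand-ins for per-example gradients (hypothetical data).
rng = np.random.default_rng(1)
train_grads = rng.standard_normal((100, 4096))
# A query gradient that closely resembles training example 7.
query_grad = train_grads[7] + 0.1 * rng.standard_normal(4096)

proj = random_projection(4096, 256)

# Attribution scores: inner product between each training gradient and the query.
scores_full = train_grads @ query_grad
scores_small = compress(train_grads, proj) @ compress(query_grad[None, :], proj)[0]

best_full = int(np.argmax(scores_full))
best_small = int(np.argmax(scores_small))
```

In this toy setup the compressed scores rank the same training example first as the uncompressed scores, while storing 256 values per example instead of 4096.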
Installation
pip install bergson
Quick Start
Build an index of gradients:
bergson build runs/quickstart --model EleutherAI/pythia-14m --dataset NeelNanda/pile-10k --truncation
Load the gradients:
from pathlib import Path
from bergson import load_gradients
gradients = load_gradients(Path("runs/quickstart"))
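Once gradients are available as an array, a small in-memory query might look like the sketch below. It assumes the gradients have already been converted to a plain 2D float array with one row per example; the exact type returned by `load_gradients` is not described here, so a synthetic array stands in for it. The function `top_k_cosine` is a hypothetical helper, not part of Bergson.

```python
import numpy as np


def top_k_cosine(index: np.ndarray, query: np.ndarray, k: int = 5):
    """Return the indices and cosine similarities of the k rows most similar to query."""
    # Normalize rows and the query to unit length, guarding against zero norms.
    index_norms = np.linalg.norm(index, axis=1, keepdims=True).clip(min=1e-12)
    index_unit = index / index_norms
    query_unit = query / max(float(np.linalg.norm(query)), 1e-12)
    scores = index_unit @ query_unit
    top = np.argsort(-scores)[:k]  # highest similarity first
    return top, scores[top]


# Synthetic stand-in for a [n_examples, grad_dim] gradient matrix (assumption).
rng = np.random.default_rng(0)
index = rng.standard_normal((1000, 64)).astype(np.float32)
query = index[42].copy()  # query with the gradient of example 42

top, top_scores = top_k_cosine(index, query, k=3)
```

Cosine similarity is only one possible scoring choice; unnormalized inner products are another common option when gradient magnitude should count.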
Benchmarks
Preprocessing
Experiments
API Reference
AttentionConfig
Attributor
Builder
CollectorComputer
DataConfig: chunk_length, completion_column, conversation_column, data_kwargs, dataset, decode_into_subclasses, format_template, prompt_column, reward_column, skip_nan_rewards, split, subset, truncation
FaissConfig
FiniteDiff
GradientCollector: backward_hook(), builder, cfg, data, mod_grads, preprocess_cfg, process_batch(), scorer, setup(), skip_hessians, teardown()
GradientProcessor: hessians, hessians_eigen, include_bias, load(), normalizers, projection_dim, projection_target, projection_type, reshape_to_square, save()
InMemoryCollector: backward_hook(), builder, cfg, data, gradients, mod_grads, preprocess_cfg, process_batch(), scorer, scores, setup(), skip_hessians, teardown()
IndexConfig: attention, attribute_tokens, auto_batch_size, decode_into_subclasses, filter_modules, force_math_sdp, include_bias, label_smoothing, loss_fn, loss_reduction, max_batch_size, modules, optimizer_state, partial_run_path, profile, projection_dim, projection_target, projection_type, reshape_to_square, skip_index, split_attention_modules, stats_sample_size, stream_shard_size, token_batch_size
PreprocessConfig
QueryConfig
ScoreConfig
Scorer
TokenGradients
collect_gradients()
load_from_optimizer()
load_gradient_dataset()
load_gradients()
load_token_gradients()
mix_autocorrelation_matrices()
Content Index
Documentation by Lucia Quirke.
If you have suggestions, questions, or would like to collaborate, please email lucia@eleuther.ai or drop us a line in the #data-attribution channel of the EleutherAI Discord!