INDEX
Explanations
names enclosed in quotation marks
quotation marks and attributed dialogue in the text
Negative Logits
populated     -0.82
permissions   -0.76
setting       -0.76
feminists     -0.75
appointments  -0.75
relate        -0.74
consumers     -0.74
defaults      -0.74
undertaking   -0.73
accessed      -0.73
Positive Logits
Skip      1.34
Rocket    1.31
Wild      1.30
Doc       1.29
Fat       1.27
Big       1.23
Rust      1.22
Kid       1.21
Thunder   1.19
Bull      1.19
Activations Density 0.037%