INDEX
Explanations
academic-related terms
repeated mentions of the term "academic."
Negative Logits
Bundy        -0.85
oning        -0.70
Pod          -0.64
feeding      -0.64
...]         -0.64
slaughtered  -0.63
xon          -0.62
Species      -0.62
gger         -0.61
Prince       -0.61
Positive Logits
ademic       1.17
academic     1.07
ulty         1.03
behavi       0.88
scholarly    0.87
academics    0.85
textbooks    0.84
essors       0.84
curric       0.84
referen      0.83
Activations Density: 0.007%