Explanations
occurrences of the text "SP" with varying activations
occurrences of the abbreviation "SP" with varying contexts
Negative Logits
Token        Logit
meal         -0.82
hran         -0.79
edin         -0.76
anchester    -0.73
tn           -0.68
oglu         -0.67
gebra        -0.63
tackle       -0.62
ornia        -0.61
Fr           -0.61
Positive Logits
Token        Logit
SP           3.70
SP           2.19
SPL          1.52
SPD          1.48
SPR          1.46
SM           1.40
RP           1.37
Sp           1.36
SL           1.35
SPI          1.34
Activations Density: 0.014%
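
For readers unfamiliar with the density figure, the sketch below shows one common way to read it: the fraction of sampled tokens on which the feature's activation is above zero. This is an assumption about how the number is defined, not something the page states. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and chosen only to reproduce a value of the same magnitude.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 14 of which activate the feature.
acts = [0.0] * 99_986 + [1.5] * 14
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.014%
```

Under that reading, a density of 0.014% corresponds to roughly 14 activating tokens per 100,000, which fits a feature that fires almost only on the string "SP".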