Explanations
references to specific locations or items related to food
Negative Logits
Token        Logit
Dyer         -0.55
ss           -0.54
amico        -0.51
AT           -0.51
inflater     -0.50
ec           -0.50
r            -0.49
StructEnd    -0.48
ert          -0.48
test         -0.48
Positive Logits
Token        Logit
pit          1.05
punch        0.95
Pit          0.93
Johnny       0.93
Pit          0.92
Johnny       0.91
صوتيه        0.91
bowl         0.87
pit          0.86
PIT          0.86
Activations Density: 0.089%