Explanations
references to "thing" in various contexts and its implications
Negative Logits
isticated    -0.82
***/         -0.75
nexpected    -0.72
اولة         -0.69
letal        -0.68
']]          -0.68
Schatten     -0.68
"}           -0.67
."</         -0.67
Positive Logits
thing        2.05
THING        1.92
Thing        1.84
Thing        1.65
thing        1.44
THING        1.42
thingy       1.23
coisa        1.13
cosa         0.99
thang        0.82
Activations Density 0.054%