INDEX
Explanations
references to specific categories or types of items
references to examples or types of things
Negative Logits
olute     -0.68
YR        -0.68
inka      -0.62
somew     -0.61
onel      -0.61
eland     -0.58
olina     -0.58
enger     -0.57
orem      -0.57
ysical    -0.57
Positive Logits
ties           0.85
ities          0.77
deals          0.67
Flag           0.63
sword          0.62
inyl           0.61
complex        0.61
Arg            0.60
gems           0.58
outsourcing    0.57
Activations Density: 0.043%