INDEX
Explanations
references to tangible goods or resources
Negative Logits
Token      Logit
xon        -0.90
entin      -0.77
insky      -0.70
rams       -0.70
nces       -0.69
acket      -0.68
alach      -0.68
instein    -0.67
bats       -0.66
igious     -0.66
Positive Logits
Token      Logit
istic      0.78
ize        0.68
istically  0.68
ocent      0.68
ista       0.65
izable     0.65
opolis     0.63
istas      0.63
opsy       0.61
ise        0.61
Activations Density: 0.016%