Explanations
phrases related to actions or functions performed by objects or individuals
instances of entities or concepts described as playing active or representative roles
Negative Logits
chev     -0.74
rants    -0.70
rone     -0.70
eor      -0.68
ties     -0.66
olds     -0.66
tten     -0.65
tom      -0.64
yd       -0.64
ulous    -0.63
Positive Logits
intermediary    1.13
liaison         0.96
catalyst        0.94
deterrent       0.94
conduit         0.92
lookout         0.92
intermedi       0.91
condu           0.88
interpreter     0.88
surrogate       0.86
Activations Density: 0.081%