INDEX
Explanations
the word "ci" with high activations
references to the field of science and its sub-disciplines
Negative Logits
Token      Logit
locked     -0.74
stage      -0.69
stood      -0.68
rooms      -0.67
ãĥī        -0.65
Malays     -0.64
lain       -0.64
GOODMAN    -0.64
boards     -0.63
ton        -0.62
Positive Logits
Token      Logit
ère        0.95
ences      0.92
enza       0.91
zona       0.86
ennes      0.86
encia      0.84
aches      0.83
otti       0.83
pe         0.83
obile      0.82
Activations Density: 0.022%
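
To give a sense of scale for the density figure, here is a minimal sketch. It assumes that density means the fraction of sampled token positions on which the feature has a nonzero activation; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. This definition is not stated in the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 22 active positions out of 100,000 tokens.
sample = [1.0] * 22 + [0.0] * 99_978
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.022%"
```

Under that assumption, a density of 0.022% corresponds to roughly 22 active tokens per 100,000 sampled.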