Explanations
words related to physical locations
words related to time and duration
Negative Logits
Token     Logit
atform    -0.88
iege      -0.81
hips      -0.77
rams      -0.70
raphic    -0.68
s         -0.68
ram       -0.67
ohn       -0.67
oran      -0.67
obal      -0.66
Positive Logits
Token      Logit
lda        0.91
lled       0.91
arthed     0.88
ographed   0.88
lla        0.87
cki        0.86
urs        0.82
ño         0.78
phrine     0.77
hower      0.77
Activations Density: 0.066%