INDEX
Explanations
words related to medical conditions and treatments
phrases related to absurdity or ridiculousness
Negative Logits
levers        -0.82
allies        -0.68
coordin       -0.67
volunteers    -0.67
moderators    -0.67
contractors   -0.66
trustees      -0.66
pumps         -0.64
scanners      -0.64
narrowed      -0.64
Positive Logits
ieval         1.08
reality       0.99
oreal         0.97
ocalyptic     0.96
history       0.96
astical       0.95
livion        0.95
ocide         0.94
antasy        0.91
ensical       0.90
Activations Density: 0.434%