Explanations
Phrases indicating a hypothetical situation with an element of comparison or speculation.
Negative Logits
aires        -0.68
idates       -0.64
moderation   -0.62
Regions      -0.60
vine         -0.59
Liberal      -0.59
cited        -0.58
Globe        -0.58
Cultural     -0.58
Progressive  -0.57
Positive Logits
magically    0.77
icket        0.73
pi           0.73
rael         0.72
somehow      0.72
rehears      0.69
rir          0.68
conflic      0.68
Ãł           0.68
superhuman   0.68
Activations Density: 0.222%