INDEX
Explanations
phrases related to beliefs and teachings
references to ideals and philosophies, particularly in a socio-political context
Negative Logits
DR         -0.78
draw       -0.77
king       -0.77
GS         -0.72
upon       -0.71
Ward       -0.70
kes        -0.70
Query      -0.69
session    -0.69
de         -0.68
Positive Logits
ideals     1.32
yip        0.80
Machina    0.74
urities    0.74
hip        0.74
cape       0.72
values     0.71
beliefs    0.70
Jinn       0.69
stances    0.69
Activations Density: 0.013%