Explanations
terms related to peace and its quality
Negative Logits
viſ        -0.74
pleaſure   -0.74
ſte        -0.71
itſelf     -0.69
houſe      -0.68
faſt       -0.65
ſta        -0.65
ſelf       -0.63
juſ        -0.61
ſmall      -0.59
Positive Logits
Decent          0.49
laaj            0.48
ParallelGroup   0.48
decent          0.47
Decent          0.47
IBOutlet        0.46
feedback        0.46
guien           0.44
zeich           0.43
muun            0.43
Activations Density 0.285%