INDEX
Explanations
sentences related to arguments, debates, and discussions
phrases about the duality or contrast between different entities or ideas
Negative Logits
Kings       -0.86
BBC         -0.85
Magic       -0.82
veyard      -0.76
Matthew     -0.74
Golden      -0.73
olitan      -0.73
Passenger   -0.71
Knight      -0.71
attery      -0.69
Positive Logits
ÂŃ          1.44
stru        1.39
sup         1.31
includ      1.26
cov         1.20
Demo        1.18
dur         1.17
polit       1.13
prin        1.13
Amer        1.12
Activations Density 0.069%