-*- mode: muse -*-

* 2007-10-10, 12:02:38 CEST

I've cleaned the Objective-C code up by making the NeXT and GNU
runtime-specific code converge a bit.  This also makes FIND-SELECTOR
return NIL for unknown selectors on the NeXT runtime, so compile-time
warnings about unknown methods are possible there now.  The latter
relies on sel_isMapped, whose semantics are not entirely clear to me.
On the one hand, Apple's reference manual states: “You can use this
function to determine whether a given address is a valid selector,”
which I interpret as meaning that it takes a selector pointer as an
argument, not a string.  On the other hand, in the preceding section,
the same document states: “You can still use the sel_isMapped function
to determine whether a method name is mapped to a selector.”

So if I have two strings that aren't the same under POINTER-EQ, but that
both name the same valid selector that is registered with the runtime,
like "self", say, does sel_isMapped work reliably in this case?  I'm not
sure.
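
A quick CFFI experiment ought to settle it.  The runtime function
names below are the real NeXT ones; passing a raw C string where a
SEL is expected is precisely the dubious part:

(cffi:defcfun ("sel_registerName" sel-register-name) :pointer
  (name :string))

(cffi:defcfun ("sel_isMapped" sel-is-mapped) :boolean
  (selector :pointer))

;; "self" is certainly registered, so this ought to be true:
(sel-is-mapped (sel-register-name "self"))

;; But what about a fresh string that merely *names* the selector?
(cffi:with-foreign-string (name "self")
  (sel-is-mapped name))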

On another note, I wonder what the difference between
sel_get_uid/sel_getUid and sel_register_name/sel_registerName might be.
They seem to do the same thing.
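
At least whether the two entry points agree on their return values is
easy to check (sel-register-name as defined above):

(cffi:defcfun ("sel_getUid" sel-get-uid) :pointer
  (name :string))

;; If they really are aliases, this should be true for any name:
(cffi:pointer-eq (sel-get-uid "description")
                 (sel-register-name "description"))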

Maybe this whole #ifdef mess isn't even strictly necessary, anyway.  I
could just copy objc-gnu2next.h from the GNUstep project (LGPLv3, so the
licensing is fine).

http://svn.gna.org/svn/gnustep/libs/base/trunk/Headers/Additions/GNUstepBase/objc-gnu2next.h



* 2007-10-04, 17:27:02 CEST

** `char' Does Actually Indicate a Char, Sometimes

The latest changes made the test cases fail on GNUstep/x86, which either
means that the PyObjC code is wrong, or the GNU runtime has very weird
calling conventions that use ints as wrappers for chars or something.
Anyway, I have reverted the changes for GNUstep and left them in place
for Mac OS X (but note that I left the PyObjC code as it is, which means
that libffi is still directed to treat chars as ints).  As a result,
both NeXT/PowerPC and GNUstep/x86 work for now, but I'm uncertain about
the status of other architectures as well as calling methods with chars
and shorts as arguments, which I've got no test cases for.  I'm not
confident that either GNUstep/PowerPC/SPARC/whatever or NeXT/x86 work
the way my code expects them to.
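
To keep track of what the code now assumes, here is the promotion
rule written out as a predicate.  The function name and the keyword
arguments are made up for illustration; the real logic is buried in
the typespec conversion code adapted from PyObjC:

(defun promote-char-like-p (runtime architecture position)
  "Should a char or short typespec be passed to libffi as an int?
POSITION is :ARGUMENT or :RETURN-VALUE."
  (and (eq runtime :next)                 ; GNUstep: promotion reverted
       (or (eq architecture :powerpc)     ; NeXT/PowerPC: always
           ;; NeXT/x86: return values only (unverified)
           (eq position :return-value))))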


* 2007-10-04, 16:52:32 CEST

** `char' Does Not Indicate a Char, Continued

There's a good chance that I've figured out what to do about the
char/int mess.  As it turns out, it isn't even limited to chars, as
shorts are affected, too.  According to the code I took from PyObjC,
specifically the typespec conversion functions in libffi_support.m, both
GNUstep and NeXT/PowerPC treat chars and shorts as ints.  The only
platform that isn't brain-damaged in this way seems to be NeXT/x86.  Or
maybe it's even more brain-damaged, as it treats shorts and chars
normally when they are used as arguments, but as ints when they're used
as return values!  At least GNUstep and NeXT/PowerPC are brain-damaged
in a *consistent* manner.

I figure the reason I never saw this problem in GNUstep is probably
endianness.  The little-endian x86 lets you treat pointers to ints as
pointers to chars without breaking anything, but that doesn't work on
big-endian machines.
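
The effect in miniature, reading the first byte of an int through a
char pointer:

(cffi:with-foreign-object (x :int)
  (setf (cffi:mem-ref x :int) 1)
  (cffi:mem-ref x :char))
;; => 1 on little-endian x86, but 0 on big-endian PowerPC.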


* 2007-10-04, 13:02:31 CEST

** `char' Does Not Indicate a Char

In principle, the typespec "c" is supposed to indicate a char.  Now look at
the following SLIME session transcript (SBCL/PowerPC on Mac OS X):

OBJECTIVE-CL> (defparameter *tmp*
                (invoke (find-objc-class 'ns-string)
                        :string-with-u-t-f-8-string "Mulk."))
*TMP*
OBJECTIVE-CL> (defparameter *tmp2*
                (invoke (find-objc-class 'ns-string)
                        :string-with-u-t-f-8-string "Mulk."))
*TMP2*
OBJECTIVE-CL> (second  ;return type specifier
               (multiple-value-list
                (retrieve-method-signature-info (find-objc-class 'ns-string)
                                                (selector :is-equal))))
"c"
OBJECTIVE-CL> (invoke *tmp* :is-equal *tmp2*)
0
OBJECTIVE-CL> (primitive-invoke *tmp* :is-equal :char *tmp2*)
0
OBJECTIVE-CL> (primitive-invoke *tmp* :is-equal :int *tmp2*)
1
OBJECTIVE-CL> (primitive-invoke *tmp* :is-equal :long *tmp2*)
1
OBJECTIVE-CL> (primitive-invoke *tmp* :is-equal :long-long *tmp2*)
4294967296

Now, I see why the last value is bogus (I'd be surprised if it weren't,
actually), but why the heck is the correct value (1, because, you see,
the strings *are* equal and +YES+ is 1 on my machine) returned only for
the wrong return type?  The return type is specified as `c', but it's
actually an int!  What's going on here?  And rather more importantly:
What can I do about this?  I don't feel exactly comfortable about
cheating and treating `c' as specifying an int on all systems based on
the NeXT runtime without having any indication about what else there is
in the NeXT runtime that has to be special-cased.  I haven't seen this
weird behaviour documented anywhere.  Even this specific case is
non-trivial, for I don't know whether this applies to all chars, or only
to chars that are booleans, or only to chars that are returned, or even
only to chars that are returned *and* are actually booleans.


* 2007-09-26, 00:13:11 CEST

** Licensing

Licensing is another open question.  For the moment, I'm releasing this
project under the terms of the GPLv3.  This seems like a reasonable
choice, because it gives me the option of giving people more permissions
later by applying the LGPLv3 to my code.  I must be aware that only I am
allowed to do this, though, and even then only if all contributors agree
(if someone actually makes a contribution, that is).  I may want to
require all contributors to dual-license their contributions, or maybe
to make them available under the terms of the LGPLv3 in the first place
(though the latter would make marking them difficult).


* 2007-09-25, 20:59:40 CEST

** Value Conversion Madness

Open question: Should NSArray instances be converted to lists or arrays
automatically?  If so, we ought to make functions like OBJC-CLASS-OF
behave in a reasonable way for those kinds of objects, i.e. return
NSArray or NSMutableArray (whatever it is that INVOKE makes out of them
when converting them into Objective-C instances again).

Note that we *must not* convert NSMutableArray instances or any other
mutable objects in this way!  Note also that our decision *must* be based
on the dynamic type of the object, not the static one, because a method
whose return type is NSArray may well return an NSMutableArray that
we've fed it sometime earlier.  This is okay for immutable objects, but
mutable objects are bound to cause trouble when such a thing happens.
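
In code, the check would have to look something like the following.
All the helper names here are hypothetical; nothing of the sort
exists yet:

(defun maybe-convert-to-lisp (objc-object)
  ;; The decision is based on the dynamic class of the object: an
  ;; NSMutableArray that comes back through an NSArray-typed return
  ;; value must be left alone.
  (if (and (objc-instance-of-p objc-object "NSArray")
           (not (objc-instance-of-p objc-object "NSMutableArray")))
      (coerce-nsarray-to-list objc-object)
      objc-object))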

Related types of objects are strings (NSString), hash tables
(NSDictionary), and numbers (NSNumber).

Note that such behaviour would make it impossible to fully identify CLOS
classes with Objective-C classes, as arrays would have no Objective-C
class to belong to.  Then again, why would you want to distinguish
Objective-C arrays from Lisp arrays in your Lisp code, anyway?  Real
integration means not having to worry about such things.

On the other hand, conversion of large NSArrays may be prohibitively
expensive, so a switch is needed, either way.  The real question is what
the default behaviour should look like.

There's an alternative to consider, too.  For NSArrays, there is
Christophe Rhodes' user-extensible sequence proposal, but even without
support for that, we can provide a *conduit* (a package) that looks like
the COMMON-LISP package, but overloads all sequence and hash-table
functions.  Overloading all sequence functions might be a lot of work,
though.
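
A sketch of what such a conduit might look like, with a hypothetical
OBJC-OBJECT-P predicate standing in for whatever the real test would
be:

(defpackage #:objcl-user
  (:use #:common-lisp)
  (:shadow #:length))

(in-package #:objcl-user)

(defun length (sequence)
  ;; Dispatch on the kind of object at run time.
  (if (objective-cl:objc-object-p sequence)
      (objective-cl:invoke sequence :count)
      (cl:length sequence)))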


* 2007-09-23, 17:09:07 CEST

** Improved Memory Management for the Masses

Up until now, the second-generation method invocation procedures
(LOW-LEVEL-INVOKE and PRIMITIVE-INVOKE) simply called MAKE-INSTANCE for
every object received from Objective-C, which meant that although a
lookup in the caching hash tables was done, method dispatch for
MAKE-INSTANCE was needed.  Therefore, everything just worked, but did so
slowly.

I realised yesterday, after profiling the code and discovering that
MAKE-INSTANCE method dispatch was now the speed bottleneck of INVOKE
calls, that overriding MAKE-INSTANCE wasn't really necessary for memory
management, as we could put instances into the hash tables and register
finalisers for them just after they were fully created.
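
Boiled down, the new scheme looks about like this.  The names are
illustrative rather than the actual ones; TRIVIAL-GARBAGE supplies
the portable weak hash tables and finalisers:

(defvar *instance-cache*
  (trivial-garbage:make-weak-hash-table :weakness :value))

(defun intern-objc-instance (pointer class)
  (or (gethash (cffi:pointer-address pointer) *instance-cache*)
      (let ((instance (make-instance class :pointer pointer)))
        ;; Register the cache entry and the finaliser only *after*
        ;; the instance has been fully created; no MAKE-INSTANCE
        ;; override needed.
        (setf (gethash (cffi:pointer-address pointer) *instance-cache*)
              instance)
        ;; The closure must not capture INSTANCE itself, only the
        ;; foreign pointer (objcl-release-pointer is hypothetical).
        (trivial-garbage:finalize
         instance (lambda () (objcl-release-pointer pointer)))
        instance)))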

So that's what I made the program do.  One of the results is much
shorter and clearer code, but the more interesting one is a speed
improvement of around a factor of 3, making 100'000 calls to
NSMethodSignature#getArgumentTypeAtIndex:, which previously called
MAKE-INSTANCE for each returned value, take around 10s on my machine.
With the CFFI speed hack enabled, caching CFFI::PARSE-TYPE results, this
figure even goes down to around 2s (that's 50'000 method calls per
second).

I think that's pretty cool.  I'm quite satisfied with method invocation
performance now.  Compared to C, we're still off by a factor of 22 or so
(0.9s for 1'000'000 method calls).  Most of the time is spent on memory
allocation for argument passing and typespec strings.  By introducing a
global pool of preallocated memory spaces for these purposes (one
argument space per thread and maybe a bunch of string buffers, with a
fallback mechanism for method calls that take too much space), we might
be able to cut the run time by another factor of 5.  After that, we
can't optimise the Lisp code any further, because most of the rest of
the time is spent within the Objective-C function
objcl_invoke_with_types (or maybe in calling it via CFFI, which would be
even worse, optimisation-wise).
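
Roughly what I have in mind (entirely hypothetical, and the 256-byte
cutoff is arbitrary):

(defvar *argument-space*
  (cffi:foreign-alloc :char :count 256))  ; really one per thread

(defmacro with-argument-space ((var size) &body body)
  "Use the preallocated buffer when SIZE fits, else fall back to a
fresh allocation."
  (let ((n (gensym "SIZE")))
    `(let ((,n ,size))
       (if (<= ,n 256)
           (let ((,var *argument-space*)) ,@body)
           (cffi:with-foreign-pointer (,var ,n) ,@body)))))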

It's probably best not to spend too much time pondering this, though,
because without the CFFI speed hack, the improvement would probably not
be noticeable, anyway (CFFI::PARSE-TYPE is most often called by
CFFI:MEM-REF and CFFI:MEM-AREF, not by the allocation routines).


** Milestones Lying Ahead

There are three things left to do that are showstoppers for actually
using Objective-CL productively.  One is support for structs.  This one
is actually quite a bit harder than it looks, because we don't
necessarily know the structure of foreign objects.  Objective-C tells us
about the structure (though not the member naming!) of structures as
well as pointers to structures that are returned by methods, but any
more indirection (that is, pointers to pointers to structs or something
even hairier) makes the Objective-C runtime conceal the internals of the
structs pointed to.  This is probably not a problem in practice, though,
as pointers to pointers to structs will usually mean a pointer that the
user may alter in order to point to other structs, not that the user
will access the structs that are pointed to.  In fact, it will probably
be best to just pass pointers on to the user.
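
In terms of the runtime's type encodings, the concealment looks like
this (32-bit encodings for NSRange; one level of indirection keeps
the member types, two levels drop them):

;; @encode(NSRange)    => "{_NSRange=II}"   members visible
;; @encode(NSRange *)  => "^{_NSRange=II}"  still visible
;; @encode(NSRange **) => "^^{_NSRange}"    members concealed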

The second thing left to do is support for defining Objective-C classes.
I think this is going to be hard.  I've not looked at the problem in
detail yet, but it looks like creating methods and classes, and
registering methods and classes are all different actions that are all
handled differently depending on the runtime.  In the case of GNUstep, I
don't even know how to register new selectors yet.

Third, varargs.  These are easy to implement, but I'm not sure what they
should look like in the case of INVOKE.  Maybe a special keyword
indicator like :* would work for indicating the end of the method name,
but I think that could be a bit ugly.
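
For the record, such an INVOKE call would read like this (not
implemented; the method is NSString's stringWithFormat:):

;; :* would mark the end of the method name proper; everything
;; after it gets passed as variadic arguments.
(invoke (find-objc-class 'ns-string)
        :string-with-format format-string :* arg1 arg2)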

We shall see.


** OpenMCL and Objective-CL Compared

On another note, I briefly checked out OpenMCL's support for Objective-C
by randomly typing a bunch of method invocations into the listener and
calling APROPOS a lot.  Here's what stuck with me:

1. You have to explicitly create selectors by using @SELECTOR.  Why is
that?  What's wrong with symbols and strings?

2. Strings designate only NSString objects, not C strings.  Why?

3. The bridge is just as fast as I expect using libffi from C to be on
that machine, that is, more than 20 times as fast as Objective-CL with
the speed hack enabled (250'000 method calls per second; my Inspiron is
faster, so don't compare this value to the ones above).

4. There's no FIND-OBJC-CLASS, but FIND-CLASS works.  Objective-C
classes seem to be normal CLOS classes whose names are found in the NS
package.

All in all, what struck me the most was the fact that the OpenMCL
Objective-C bridge does not seem to make use of the concept of
designators as much as Objective-CL does.  You have to define C strings
and selectors explicitly, which I consider a minor annoyance.  It's
faster, though.  Then again, considering that it's integrated into the
compiler, I was a bit disappointed by the speed, because I figured that a
native-code compiler could do better than libffi (which is still a lot
slower than directly calling stuff from Objective-C).


** By The Way

Gorm rules.  We need to make Objective-CL fully Gorm-compatible.