RSS

(root)/packagedb/0.5.x : /docs/NewAPI.rst (revision 615)

To get this branch, use:
bzr branch /bzr/packagedb/0.5.x
Line Revision Contents
1 408.1.28
=================
2
New PackageDB API
3
=================
4
5
:Author: Toshio Kuratomi
6
:Date: 14 Oct, 2008
7 48.12.25
:For Version: 0.5.x
8 408.1.28
9
The first version of the PackageDB API grew somewhat organically.  This document
10
exlains what API changes are going into 0.4.x to create a more cohesive, easier
11
to use API.
12
13
.. contents::
14
15
-----------
16
Current API
17
-----------
18
19
There are three major parts to the current API that were driven by different
20
needs in the early development.
21
22
* ``pkgdb/acls/(bugzilla|vcs)``: Being able to get information comparable to
23
  the existing ``owners.list`` for the purpose of setting initialcclist and
24
  owner in bugzilla and generating acls on the cvs server.
25
* ``pkgdb/packages/name/$PKGNAME``: Being able to extract information from the
26
  packages for use by Bodhi_ and command line clients.  This mirrors the
27
  information available about a package in each page.
28
* ``pkgdb/packages/dispatcher/$METHOD``: JSON methods for the web UI to talk
29
  to when setting values.  This is the oldest of the interface and therefore
30
  has the most stuff that in hindsight was done wrong.
31
32
Two other parts of the interface are small and evolving.  They are somewhat
33
test beds for the new ideas in this document rather than legacy interfaces
34
that will need invasive changes to fix.
35
36
* ``pkgdb/users/``: needed to create a web interface for users to see which
37
  packages they owned.  Similar to ``pkgdb/packages/name`` this should let
38
  people extract the same information via json.
39
* ``pkgdb/stats/``: a web interface for people to see statistics about who
40
  owns how many packages.  This should also be constructed to let people
41
  extract the same information via json.
42
43
.. _Bodhi: https://fedorahosted.org/bodhi
44
45
Issues
46
======
47
48
Private Details
49
~~~~~~~~~~~~~~~
50
51
Currently, many of the controller methods in ``/packages/dispatcher`` use the
52
numeric id for the item they're talking about.  This is a detail that should
53
be hidden in the API.
54
55
New API Fixes
56
-------------
57
58
1. Use the username rather than the userid in all JSON API and on all web pages.
59
60
  A. This goes along with converting the database to `store usernames`_ rather
61
     than userids.
62
63
2. Use package name, not the numeric ids.
64
3. Send Collection Name and Collection Version, rather than Collection ID.
65
4. When talking about a PackageListing, send Package Name, Collection Name,
66
   and Collection Version instead of the PackageListing id. 
67
5. When talking about a particular acl, send the Package Name, Collection
68
   Name, Collection Version, acl name, and username instead of the acl id.
69
   (This example shows why we decided to send the id for things instead of the
70
   symbolic names.  However, there could be some ways to make this case better
71
   in the `Granularity`_ section)
72
73
.. _`store usernames`: NewDBModel.html#Change User and Group
74
75 408.1.30
Too Much Data
76
~~~~~~~~~~~~~
77
78
Currently, many methods return whole data structures from the database like
79
:class:`PackageListing`.  Because the remote end some other pieces of that,
80
for instance the list of :class:`PersonPackageListingAcls` on the
81
:class:`PackageListing`, we end up sending a huge nested data structure.  We
82
should make an effort to pare that down to a more reasonable sizeso that the
83
JSON data is smaller.
84
85
New API Fixes
86
-------------
87
88
Need to analyze the data that's actually being used vs what's being sent.
89
Then create different database queries that get more targetted data.  Also
90
need to preprocess some of the data so that we don't include so much
91
extraneous information when we finally send it.
92
93 408.1.28
Some Parameters are Composites
94
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
95
96
Some parameters, especially ids that are passed to the dispatcher methods are
97
combinations of several values.  In other words, we sometimes see things like
98
this (Note: in the real code, the composite might also consist of numeric ids)
99
100
  /pkgdb/packages/dispatcher/set_owner/libfoo:FedoraDevel:toshio
101
102
being sent to a method like this::
103
104
  class Dispatcher(controller):
105
      def set_owner(self, divName):
106
          package, collection, newOwner = divName.split(':')
107
          pass
108
109
New API Fixes
110
-------------
111
112
Composites are a bit tricky.  Since we're going to stop sending numeric ids,
113
we'll have to send several pieces of data that's conceptually a composite to
114
the server.  We can either send them as three args and let the server deal
115
with it like this::
116
117
  /pkgdb/packages/dispatcher/set_owner/libfoo/Fedora/devel/toshio
118
119
  class Dispatcher(controller):
120
      def set_owner(self, package, collectionName, collectionVer, newOwner):
121
          pkgListing = PackageListing.query.filter(
122
                  and_(Package.name==package, 
123
                       Collection.name==collectionName,
124
                       Collection.version==collectionVer)).one()
125
126
or we can package it up on the client and send it to the server::
127
128
  class Dispatcher(controller):
129
      def set_owner(pkgListingId, newOwner):
130
          pkgListing = PackageListing.query.filter(
131
                  and_(Package.name==pkgListingId[0], 
132
                       Collection.name==pkgListingId[1],
133
                       Collection.version==pkgListingId[2])).one()
134
135
Since packaging the data into a single object (for instance a tuple/list) and
136
encoding that (probably with json) are extra steps, I'm inclined towards going
137
with option 1.  However, this does mean that sending things like acls will
138
have a very long argument string with no organization.  Hopefully, working on
139
`Granularity`_ will let us sidestep this issue in the majority of cases.  If
140
not, we should do the second one everywhere.  Consistency is very important
141
here.
142
143
Statelessness
144
~~~~~~~~~~~~~
145
146
The toggle methods in the current dispatcher class rely on the server not
147
changing the state of a toggle between the client rendering the page and
148
submitting the change request.  This doesn't work when multiple people update
149
the same information.
150
151
New API Fixes
152
-------------
153
154
1) Remove the toggle functions.
155
156
  A) Any replacements that are written should be done by setting something to a particular state.
157
158
*Bad*::
159
160
  def toggle_owner(self, package, collectionName, collectionVersion):
161
      pkgListing = find_package_listing(package, collectionName, collectionVersion)
162
          if pkgListing.owner == identity.current.user_name:
163
               pkgListing.owner = 'orphan'
164
          elif pkgListing.owner == 'orphan':
165
               pkgListing.owner = identity.current.user_name
166
          else:
167
               # Error, the new owner cannot become the owner of a currently owned package
168
               pass
169
170
*Good*::
171
172
  def set_owner(self, package, collectionName, collectionVersion, newOwner):
173
      pkgListing = find_package_listing(package, collectionName, collectionVersion)
174
      if allowed_set_owner(identity.current, pkgListing, newOwner):
175
          pkgListing.owner == 'newOwner'
176
177
Note that ``allowed_set_owner()`` in the second example hides the complexity
178
of deciding whether the user can set the owner to ``newOwner``.
179
``allowed_set_owner()`` would be more complex than what is in
180
``toggle_owner()``.
181
182
Unify CommandLine Client with WebUI API
183
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
184
185
Currently the Command line client has two methods that pretty much do
186
everything that the WebUI does with five to ten methods.  We want to merge
187
these so that the WebUI is just another client with a shared server API.
188
189
Separate Admin from User Functions
190
----------------------------------
191
192
Conversely, the command line client does a lot of things that a user should be
193
able to do.  But the APIs are locked so that only an admin can do them.  Admin
194
functions should be strictly separated from user functions so that only things
195
that really are admin-only have their own API.
196
197
New API Fixes
198
-------------
199
200
See the entry for `Granularity`_ for how we'll fix these issues.
201
202
Accessibility
203
~~~~~~~~~~~~~
204
205
We need to have some method of sending a request to the server via form
206
submission if the user's browser does not support AJAX.  To do this, the
207
client needs to provide a static html page to the browser.  A script on the
208
page will substitute dynamic entries for the static entries where appropriate.
209
For instance::
210
211
  <form>
212
  <input type="submit">
213
  </form>
214
  <script>
215
    jQuery.onload(// Create a handler to call an AJAX method instead of submitting this part of the page)
216
  </script>
217
218
If the browser supports scripts, then the submit button's action will be
219
overridden and we'll send AJAX requests as we currently do.  If the browser
220
does not support scripts, the submit button will POST the changes needed to a
221
server method which will return the refreshed page afterwards.
222
223
Granularity
224
~~~~~~~~~~~
225
226
Currently, each individual change on the Web App has its own method.  This
227
causes more round trips than necessary.  However, it does allow us to make
228
changes via AJAX since each method can operate simultaneously with the other
229
pieces.  We need to find a balance that does high level operations for speed
230
but allows us to make AJAX calls to perform those operations.
231
232
New API Fixes
233
-------------
234
235
There are several layers at which people may want to operate:
236
237
:_`Layer 1`:
238
  Make a single change.  This is what the API for the WebUI currently does.
239
  It's great for people who want to fine tune their acls on a single package
240
  but not so good for what we're really using the PackageDB for these days.
241
  This layer should probably be available but it doesn't (and can't) be
242
  optimized as well as the other layers since the grain is so fine.
243
244
:_`Layer 2`:
245
  Make a set of logical changes.  This includes things like: "Make person
246
  co-maintainer", "Approve All Pending Acls", "Take Ownership of Package in
247
  all Collections".  This is the level at which we usually want to operate.
248
249
:_`Layer 3`:
250
  Make a whole set of not necessarily logically related changes.  This is the
251
  ideal for the non-Javascript enabled version and may also be ideal for the
252
  command line client.  Make a bunch of changes to a package.  Hit Submit.
253
  All changes to the package are applied.  
254
255
We should be prioritizing `Layer 2`_ at the moment.  It is the layer at which
256
we can make the biggest difference in terms of usability for the end users and
257
responsiveness of applications.  `Layer 1`_ can be left in its semi-working
258
state for the moment and replaced with something that allows better
259
customization (maybe on the user's personal page).  `Layer 3`_ may need to be
260
worked on for accessibility reasons or we may be able to get away with doing a
261
non-optimal job with `Layer 2`.  (Multiple submit buttons on a page that hit
262
only commit a part of each page.)
263
264
`Layer 2`_ can have permissions set on logical units of change.  We should be
265
able to tell if a user is or is not allowed to ``watch_package()`` vs
266
``maintain_package()`` vs ``own_package()`` vs ``make_maintainer(other)``.
267
`Layer 3`_ would not be this granular which makes it harder to design
268
appropriate steps for.  `Layer 3`_ will have to check for proper permissions
269
for pieces of a requested change.
270
271
How to make this transition?
272
""""""""""""""""""""""""""""
273
274
1) Make a conscious effort to design and use API at `Layer 2`_.  When making a
275
   new element for a client (page or command line client), strive to use
276
   something that affects multiple logically related elements rather than
277
   individual objects.
278
279
2) Internally, move the code in the `Layer 1`_ Public API to their own class
280
   so we have very little code in the `Layer 1`_ controller.  Build `Layer 2`
281
   (at least, initially) by calling these `Layer 1` internal APIs.
282
283
3) Identify bottlenecks where we call one `Layer 2`_ API which makes an
284
   internal call 10 times which calls the database 10 times each.  Optimizing
285
   those to make less database calls will be a priority.
286
287
Logging
288
~~~~~~~
289
290
Logging each change to a set of acls independently is bad.  We want to log
291
changes to the set of acls as a single unit.  However, we want the user to
292
still experience each acl updating independently on the web.
293
294
New API Fixes
295
-------------
296
297
This can be done by making logging an asynchronous process.
298
299
1) Log to the db immediately along with the rest of the transaction just like
300
   now.
301
2) After a timeout, a process runs that looks through the db for all logs
302
   since the last run, picks out logs that should be bundled together, and
303
   sends a combined log to mailing lists, etc.
304
305
Error Handling
306
~~~~~~~~~~~~~~
307
308
Error handling is currently pretty ad hoc.  We return a variable, status with
309
every call to dispatcher.  This variable contains True if everything was fine,
310
False otherwise.  A message variable is sent if status==False.  We want to
311
standardize this in some manner so that our common client base class can take
312
error messages from bodhi or pkgdb with equal ease.
313
314
New API Fixes
315
-------------
316
317
We have two issues that need to be resolved:
318
319
1) How do we determine if a call produced an error?
320
2) How do we get a useful message from the error?
321
322
We're going to implement a standard for all Fedora Web Services via BaseClient
323
that resolves this.  It will use a variable named `exc` that returns an error
324
name (could be an actual exception but we currently don't want to raise these
325
on the client because we haven't evaluated what harm a server could do to a
326
client which raises arbitrary exception names) and a message in tg_flash.
327
BaseClient will take care of raising an exception from this information for
328
the client code to consume.
329
330
DB Changes
331 408.1.34
~~~~~~~~~~
332 408.1.28
333
Please see the New DB Model Documentation.  Some of these changes will be
334
internal and some of this will be externally visible.
335
336
.. toctree::
337
   :maxdepth: 2
338
339
   NewDBModel.rst
340
341 408.1.34
PEP8ify and Shorten Variable Names
342
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
343
344
Names of public variables and parameters are unwieldy in several ways:
345
346
1) They are too long.  This is especially seen in database column names.  This
347
   is bad because returning values as json has to repeat the column name
348
   multiple times.
349 48.2.161
2) They do not comply with PEP8_.  PEP8_ specifies underscores to divide words.
350 408.1.34
   We presently use camelCase.
351
352 48.2.161
.. _PEP8: http://python.org/dev/peps/pep-0008/
353 408.1.34
354
New API Fixes
355
-------------
356
357
Some of these changes will be fixed via `DB Changes`_.  Below
358
is a list of changes to parameters and return values unrelated to the
359
database.  In general the following rules are followed:
360
361 48.2.154
1) If it's a db object, it follows the db object spelling
362 408.1.34
2) If it's two words in standard usage, it becomes two words separated by
363
   underscore.
364
3) If it is abbreviated in one place, it is abbreviated everywhere.
365
4) Numeric id parameters are being removed.  We'll pass symbolic names or fully
366 509
   expanded names instead.  (pkg='foo', collctn='F-8' instead of
367 408.1.34
   pkg_listing_id='10998')
368
369
========================  ================
370
Current Name              New Name
371
------------------------  ----------------
372
package                   pkg
373
packageName               pkg_name
374
package_name              pkg_name
375 537
packageList               pkg_lst
376 509
collection                collctn  (Use this for collection short name)
377
collectionName            collctn_name
378
collectionVersion         collctn_ver
379 537
cc_list                   cc_lst
380 408.1.34
qacontact                 qa_contact
381
fasname                   username
382 537
pkg_listing_id            pkglisting_id
383 509
collection_id             collctn_id
384 408.1.34
========================  ================
385
386 408.1.28
-----------------
387
Internal Cleanups
388
-----------------
389
390
391
Separate Display from Manipulation
392
==================================
393
394
:Status: Design
395
396
Because of the way decorators work, the controller methods are tightly bound
397
to being called from the web.  We cannot call a decorated controller method
398
and expect to get back reasonable information (often we will get an error).
399
We need to change controller methods to be a lightweight layer that feeds the
400
templates/json.  These talk to other functions/classes that do actual
401
manipulation of the database objects.  This allows us to reuse code much
402
easier since we are cleared about what should go into a controller and what
403
should be in its own function.
404
405
Organization
406
~~~~~~~~~~~~
407
408
A shadow class was considered but didn't seem very elegant::
409
410
  class Dispatcher(controller):
411
      pass
412
      # Variety of public methods that are @exposed to the web.
413
414
  class DispatchDoer(object):
415
      pass
416
      # Dispatcher methods are just a thin layer that calls methods in this class
417
418
419
Adding these methods to the mapped classes from the model is currently being
420
tried::
421
422
  class Dispatcher(controller):
423
      def change_owner(self, pkg, newOwner):
424
          pass
425
          PackageListing(pkg).set_owner(newOwner)
426
427
  class PackageListing(SABase):
428
      def set_owner(newOwner):
429
          if can_own(newOwner):
430
              self.owner = newOwner
431
432
Porting
433
-------
434
435
No matter what we come up with, there will be considerable internal
436
reorganization to make this work.  However, this should all be hidden from the
437
end-user behind the JSON API.  So there should be no external porting
438
required.

Loggerhead 1.18.1 is a web-based interface for Bazaar branches