Showing posts with label Computer Programming. Show all posts

Sunday, December 31, 2006

Gray Areas In Java

In case you ever wondered what weak and soft references are in Java, here are a couple of good articles that explain what they are and when you might use them:

Tuesday, December 05, 2006

Forward Declarations And Faster Compilations

In C++, a forward declaration is a declaration of a function or type that occurs before the definition of that function or type. For example, suppose you write the small program:
#include <iostream>

int main(void) {
  doSomething();
  return 0;
}

void doSomething(void) {
  std::cout << "Hello World!" << std::endl;
}


This program won't compile. You'll get an error like dosomething.cc:4: error: `doSomething' undeclared (first use this function), because doSomething() is called in the source code (line 4) before it has been declared (line 8 onwards).

The simplest way to fix this is to switch the order in which main() and doSomething() appear in the file, i.e.
#include <iostream>

void doSomething(void) {
  std::cout << "Hello World!" << std::endl;
}


int main(void) {
  doSomething();
  return 0;
}


Another way to correct the file is to use a forward declaration of doSomething(). i.e.
#include <iostream>

// forward declaration
void doSomething(void);


int main(void) {
  doSomething();
  return 0;
}

void doSomething(void) {
  std::cout << "Hello World!" << std::endl;
}

The (potential) advantage of this approach is that it lets you lay out the file in whatever order you see fit (e.g. to make it more readable) while still satisfying the compiler (i.e. the language requirements).
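
Sometimes reordering isn't even possible. As a minimal sketch (the functions isEven() and isOdd() are hypothetical and not part of the program above), two functions that call each other cannot both be defined before their first use, so at least one of them needs a forward declaration:

```cpp
// forward declaration: isEven() calls isOdd() before isOdd() is defined
bool isOdd(unsigned int n);

bool isEven(unsigned int n) {
  return n == 0 ? true : isOdd(n - 1);
}

bool isOdd(unsigned int n) {
  return n == 0 ? false : isEven(n - 1);
}
```

Whichever function you put first, the other one has to be declared ahead of it.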

Now consider the following larger example with three classes:
Name.h
#include <string>

class Name {
  public:
    Name(const std::string & firstName, const std::string & lastName);

    std::string getFirstName() const;
    std::string getLastName() const;

  private:
    std::string _firstName;
    std::string _lastName;
};


EmailAddress.h
#include <string>

class EmailAddress {
  public:
    EmailAddress(const std::string & emailAddress);

    std::string getEmailAddress() const;
    std::string getDomain() const;

  private:
    std::string _userName;   // e.g. "JohnSmith" part of "JohnSmith@gmail.com"
    std::string _domain;    // e.g. "gmail.com" part of "JohnSmith@gmail.com"
};


Person.h
#include "EmailAddress.h"
#include "Name.h"

class Person {
  public:
    Person(const Name & name, const EmailAddress & emailAddress);

    const Name & getName() const;
    const EmailAddress & getEmailAddress() const;

  private:
    const Name & _name;
    const EmailAddress & _emailAddress;
};


Person.cpp
#include "Person.h"

Person::Person(const Name & name, const EmailAddress & emailAddress)
  : _name(name), _emailAddress(emailAddress)
{
  // empty
}

// etc.


NameFunctions.cpp
#include <iostream>
#include <vector>
#include "Person.h"

void printFirstNames(const std::vector< Person > & persons) {
  for( std::vector< Person >::const_iterator iter = persons.begin();
      iter != persons.end(); ++iter ) {
    std::cout << (*iter).getName().getFirstName() << std::endl;
  }
}


Notice how Person.h includes both Name.h and EmailAddress.h. Consider what happens when NameFunctions.cpp is compiled. The preprocessor must replace the include of Person.h with the contents of that file. To do this, the preprocessor first recursively includes the contents of Name.h and EmailAddress.h in Person.h, since Person.h includes these two files. Therefore, to build NameFunctions.cpp, at least four files must be loaded and processed (not counting the standard headers): NameFunctions.cpp as well as its includes (Person.h) and the includes' includes (Name.h and EmailAddress.h).

Similarly, consider what happens when you change EmailAddress.h and then run a program like make to recompile any affected source code. Make must consider every file that could be affected by this change, and NameFunctions.cpp is one of them! (It includes EmailAddress.h indirectly, as shown above.) Likewise, computing the files that depend upon EmailAddress.h is non-trivial, since you need to traverse from EmailAddress.h to Person.h to NameFunctions.cpp.

So what do all these includes mean for your compilation times?

  • Compiling a source file involves loading a lot of header files into memory (not just the ones listed in the source file, but also the includes' includes, the includes' includes' includes, etc.), which can be many more files than you would suspect. In a large project, you are going to have a lot of these files, scattered across your disk, and each will be used only some of the time. The limited locality and large input set aren't favourable to caching, so the hard drive could be hit often (which is slow).
  • Dynamically figuring out dependencies is costly, since the graph of related files has a lot of edges. (And this can, again, involve lots of disk accesses).
  • Dynamically figuring out what dependencies have changed since you last ran make is expensive because the dependencies are expansive. (Again, lots of disk accesses).
  • When you change a header file, there will be a cascading effect of rebuilding a lot of source files.

Now, is all this work really necessary? First, notice that NameFunctions.cpp technically depends on EmailAddress.h (since Person.h does), but NameFunctions.cpp only really uses the Person and Name classes (and not the EmailAddress class). Next, notice that the definition of Person in Person.h only uses references to Name and EmailAddress. Hence, when the compiler processes Person.h, it only needs to know that Name and EmailAddress are classes, not the substance of those classes. (In practice a reference member is stored as a memory address, typically one word, so the memory layout of Person is independent of the contents and interface of these two classes.) Therefore, we can replace the two lines in Person.h:
  #include "Name.h"
  #include "EmailAddress.h"

with the lines:
  // forward declarations
  class Name;
  class EmailAddress;

And the start of NameFunctions.cpp must now be changed from
  #include "Person.h"

to
  #include "Person.h"
  #include "Name.h"

(as the function in this file invokes Name::getFirstName(), a forward declaration does not suffice; the compiler needs to see the definition of Name).
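
To make the distinction concrete, here is a minimal single-file sketch of what a forward-declared (incomplete) type does and does not allow. The NameHolder struct and the firstNameOf() function are hypothetical names made up for illustration, and the Name class is a cut-down stand-in for the one in Name.h:

```cpp
#include <string>

// forward declaration: from here on, Name is an incomplete type
class Name;

// OK: declaring a function that takes a reference needs no definition of Name
std::string firstNameOf(const Name & name);

// OK: a pointer (or reference) to an incomplete type has a known size
struct NameHolder {
  const Name * name;
};

// Not OK at this point (neither line would compile):
//   Name copy(other);        // needs the size and constructors of Name
//   name.getFirstName();     // needs the members of Name

// The definition (simplified stand-in for what #include "Name.h" supplies)
class Name {
  public:
    explicit Name(const std::string & firstName) : _firstName(firstName) {}
    std::string getFirstName() const { return _firstName; }

  private:
    std::string _firstName;
};

// Name is now complete, so its members can be used
std::string firstNameOf(const Name & name) {
  return name.getFirstName();
}
```

Person.h is in the first situation (it only declares references), while NameFunctions.cpp is in the second (it calls a member function), which is why only the latter needs the include.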

What are the consequences of these changes? The code still compiles, but faster:
  • Compiling NameFunctions.cpp no longer involves loading EmailAddress.h.
  • Figuring out the dependencies of NameFunctions.cpp is simpler because there are fewer dependencies (and they are likely more stable).
  • Figuring out what files need to be recompiled is less costly because there are fewer dependencies to examine.
  • Rebuilding the affected source code after a change to EmailAddress.h is less work because the (falsely) dependent file NameFunctions.cpp no longer needs to be built.

In addition to slower compile times, using includes in headers (instead of forward declarations) can cause other problems. For example, suppose A.cpp uses the class C, but gets it indirectly by including B.h instead of C.h.
  • What happens if B.h is changed to no longer include C.h? Then A.cpp no longer compiles, an unfortunate side effect. Hence, the code (without forward declarations) is brittle and not self-documenting (as A.cpp's dependency on C.h is not explicit).
  • Dependencies are also less likely to be calculated properly. Suppose your dependency calculation is shallow (i.e. it determines that A.cpp depends upon B.h, but not that A.cpp depends upon the dependencies of B.h, such as C.h). What happens if the definition of C changes in C.h? A.cpp will not be recompiled, and the resulting program is likely to exhibit bizarre behaviour (e.g. if C's v-table was altered).
  • Searching for dependent files by using a command like grep "#include \"C.h\"" will miss A.cpp.

Sunday, November 26, 2006

What Does Peanut Butter Tell You About Work?

The past year at work has been an...interesting?!...one and I know a lot of my friends and co-workers would have a lot to say about it too. Recent news like Yahoo's Peanut Butter Memo and Siebel and IBM settling class-action lawsuits regarding overtime feel somewhat relevant and only give me pause to think about my current situation and where I'd like to be...I'm not really sure. I'm not even sure that work can make you happy anymore...I suppose this means I should either be doing something else or find something outside of work that makes me happier...

Monday, November 06, 2006

Memories of 342

Building a better HashMap is a short, but interesting, article on IBM's website about how java.util.concurrent.ConcurrentHashMap is implemented to provide much more efficient concurrency than java.util.Hashtable.

Tuesday, October 24, 2006

Good Agile, Bad Agile

I just finished reading Steve Y's Good Agile, Bad Agile. I think this anecdote was my favourite part:

Most engineers are not early risers. I know a team that has to come in for an 8:00am meeting at least once (maybe several times) a week. Then they sit like zombies in front of their email until lunch. Then they go home and take a nap. Then they come in at night and work, but they're bleary-eyed and look perpetually exhausted. When I talk to them, they're usually cheery enough, but they usually don't finish their sentences.

Sunday, October 22, 2006

Work

Working at Microsoft is an interesting essay. (It's one of the many articles I bookmarked and took forever to get around to reading, so I'm not sure where I found it). I can relate to many of the points in it. I especially liked the subsection "Managers". The statement "...the Seattle area is known for being somewhat isolating — lots of young, ambitious professionals with no time for making friends..." is also too true.

Thursday, October 19, 2006

CRC

In case you ever wondered how CRC worked (or how to use what you learned in Polynomials, Rings and Finite Fields), Cyclic Redundancy Check gives a short description of the math behind CRC as well as how to implement it as an algorithm.
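
For a taste of what such an algorithm looks like, here is a minimal bit-at-a-time sketch. It is not taken from the article: it assumes the widely used CRC-32 parameters (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF), and the function name crc32 is my own choice.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 over a byte buffer (assumed parameters: reflected
// polynomial 0xEDB88320, initial value 0xFFFFFFFF, final bit inversion).
std::uint32_t crc32(const unsigned char * data, std::size_t length) {
  const std::uint32_t polynomial = 0xEDB88320u;
  std::uint32_t crc = 0xFFFFFFFFu;

  for (std::size_t i = 0; i < length; ++i) {
    crc ^= data[i];  // fold the next byte into the low bits of the remainder
    for (int bit = 0; bit < 8; ++bit) {
      // Polynomial division over GF(2): shift out one bit and, if that
      // bit was set, subtract (XOR) the generator polynomial.
      if (crc & 1u) {
        crc = (crc >> 1) ^ polynomial;
      } else {
        crc >>= 1;
      }
    }
  }
  return ~crc;  // final XOR with 0xFFFFFFFF
}
```

With these parameters, the nine ASCII bytes "123456789" give the standard check value 0xCBF43926. Real implementations usually process a byte at a time with a 256-entry lookup table instead of looping over individual bits.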

Monday, October 16, 2006

Web Services

I transferred to the Web Services group at the start of October. It's a little strange to not know what I'm doing, anyone I work with, or anything about where I work (as it's in a different building in a different part of town). It's kind of like the start of a co-op work term all over again. I think I needed a change of scenery though and I'm excited to try out something new.

The people I work with seem nice enough and I'm slowly figuring out the project (which has a lot of subtle points). The location of the office isn't very good (no Tully's; not much in the way of lunch choice) and the bus service reminds me way too much of living in Kanata - infrequent, indirect, and with an inconsistent schedule. What distinguishes Seattle from Kanata, however, is that it is unreliable, which is quite impressive!? (Disappointing!?) [Recall that I live in downtown and I work in downtown; whereas Kanata is the suburbs of the suburbs...]. The office does have great views though. I'll have to bring my camera to work some day and take a few shots of the Seattle skyline.

In case you don't know what Web Services are all about, the articles Amazon Web Services and Extending Web Services Using Other Web Services from Linux Journal should give a sense of why they are all the rage. (Although, what's described in the articles isn't really at all what I do, so don't read too much into them! :-).

Sunday, October 08, 2006

Learn Ruby Or Python?

I've been reading a lot of short computer-programming articles online lately, such as Exception-Handling Antipatterns, which I bookmarked a couple weeks ago (as a possible reference for a talk about Defensive Programming that I gave two weeks ago), but didn't end up reading it until today.

I also learned about Duck Typing (via Joel on Software's Ruby Performance Revisited). The articles The Perils of Duck Typing and Java does Duck Typing were interesting follow-ups.

I think I need to get around to learning Ruby. I've been meaning to for a while, but have never gotten any farther than bookmarking Programming Ruby and Why's Poignant Guide To Ruby. Do you think I should teach myself Ruby or try learning Python first?

Monday, June 26, 2006

A Couple Interesting Articles

Bush Is Not Incompetent - it's worth reading and it won't be what you expect.

Inversion of Control Containers and the Dependency Injection Pattern - an older computer science/programming article, but I found it enlightening. (Perhaps the title doesn't make a whole lot of sense, but if something has a title that long and still isn't clear then I can't succinctly explain what it's about either! ;-)

Monday, April 10, 2006

Are Software Patents Evil?

Patent trolls are companies consisting mainly of lawyers whose whole business is to accumulate patents and threaten to sue companies who actually make things. Patent trolls, it seems safe to say, are evil. I feel a bit stupid saying that, because when you're saying something that Richard Stallman and Bill Gates would both agree with, you must be perilously close to tautologies.

Are Software Patents Evil? is (another) well-written essay by Paul Graham. (It even uses hockey as a metaphor!)

Saturday, February 25, 2006

Great Design

Joel on Software has started a series of articles on Great Design.

Saturday, February 11, 2006

The Pragmatic Programmer

I finished reading The Pragmatic Programmer a couple weeks ago. (And have been meaning to mention it, but kept forgetting). The book outlines an approach to software development summed up by its first "tip": Care About Your Craft - Why spend your life developing software unless you care about doing it well?

What you learn in school tends to be either aspects of Computer Science or the syntax and features of various programming languages. Both are useful, but developing software and working with complex systems have many other facets, many of which are best learned through experience. The book is a collection of short chapters that discuss what the authors have learned about the latter.

The engineering process - creating maintainable and testable software, design, documentation, and debugging - receives a lot of coverage. But other engineering topics and skills, including tools, architecture, communication, scoping and estimation, as well as career development, are discussed. The writing style is clear and full of examples, but concise (in a good way).

I highly recommend this book as it will (likely) introduce many new and helpful ideas as well as solidifying things that you may have learned along the way but aren't quite sure how to articulate or reason about.

Thursday, March 17, 2005

Politics-Oriented Software Development

Documentation is an essential tool in the twin goals of ass-covering and of managing management...strike the correct tone of opaque vagueness and unshakeable authority.

A kind of funny (but cynical) article about Politics-Oriented Software Development.

Friday, March 12, 2004

No More RPC

I just submitted the CS 454 assignment. 3788 lines of code plus a couple of documents about how wonderful it is. [Well hopefully the TA thinks it's wonderful :-)]. That is the end of my computer programming assignments as an undergrad student...Wow...That is a very nice realization (as I'm tired of being sleep deprived while chasing down segmentation faults..). :-) Time to make plans for going out tonight.

Thursday, March 11, 2004

The return of f0(int, int) is: 15

Just under 32 hours to go...the middleware layer actually works now! Well it works for a procedure that adds 5 to 10 and gets 15. My to do list is down to 4 things plus testing (and fixing bugs...well hopefully not) and writing all the documentation. Not too bad. I think I'll take a break...

Wednesday, March 10, 2004

Blood-Shot Eyes

Man, my distributed computing assignment is driving me crazy. It is due in about 46 hours and I still am not done coding (so I can't even test)...too many crazy errors and too many pieces...I'm almost done...have to write 3 more things before I can get the client and server talking, plus my to do list has 4 other things that I should...plus I need to write server-side skeletons and write a real client and server that I can use for testing...(and I haven't even thought about the write-up that needs to go with the RPC layer)...Procrastination is a bad thing.

Sunday, February 22, 2004

Time Flies

I can't believe reading week(end) is over already. :-( I did too much homework to relax enough and too much slacking to get enough homework done.

Probably the most interesting thing I did was going Friday night with a bunch of my friends to an African-cuisine restaurant downtown. I don't think any of us were impressed. The service was really slow. I had a beef dish with beef chunks, sauce, and according to the menu, there were a few vegetables in there too, but there were so few, they were hard to find. It wasn't warm enough either. (Cold food isn't usually good food). The salad consisted exclusively of lettuce and tomatoes - nothing else. And I don't like tomatoes. The bread was interesting. The plates were pizza dishes and the bread covered the entire plate. It was kind of like a pancake, but cold and rubbery. For the portion-size and the quality of service, the prices were too high too. To be nice though, it was better than the last time we went to Jack Astors...

Other than that I've been working on Distributed Computing a lot. Aside from the midterm on Wednesday, there is a big programming assignment, where we have to implement an RPC middleware with a binder process and a bunch of other bells and whistles. It's not too terrible, except I couldn't find a partner, so I'm doing it all myself. (Hopefully the TAs will be cool with that). I figure I'm about 40% done. Most of the socket stuff is out of the way (I forgot how crazy some of the socket stuff can be!). I have the server talking to the binder and vice versa, but I need to write the part where what they say is meaningful. (I think I'm into the outside-in programming methodology - kind of a blend of top-down and bottom-up!). For the client, I can rip-off (i.e. reuse) most of what I've already written as (I think) I've made it all nice and generic.

Saturday, January 31, 2004

Busy Day

I just got back from Daniel's/Geoff's/Andrew's where they were deep frying stuff. Lots of people were there; most of them I haven't seen in a while too as I'm not in any Pure Math classes these days. It was good times. [I missed the Leafs game though - they got revenge against Ottawa, winning 5-1. :-)]...I spent the afternoon at school working on Waterloo's January local programming contest. It was a hard problem set. The standings are here - I was 15th out of 44 (or so). I only got one problem (E). I couldn't figure out how to properly solve the next-easiest (C), so I spent the rest of my time on the only problem no one solved (A).