Deep dive into HCL’s strategy for Domino/Notes

HCL was kind enough to invite me to their CWP Factory Tour, themed after Charlie and the Chocolate Factory and held in Chelmsford, Massachusetts on July 11-12, 2018.


We already knew from Engage and DNUG that Richard Jefts has no qualms in donning a silly costume, but seeing Jason Gary as an Oompa-Loompa was truly memorable. We now have photos and a means to blackmail them.

Many thanks to them and the HCL Staff for organising the event; it was definitely worth the trip.

Letting the developers loose.

Over the course of two days, the HCL Staff, the core developers of Domino 10, showed us what they had been working on. We had unprecedented access to the core team, and the interactions (a series of presentations and workshops on the first day, and an impromptu 'Ask the developers' session on the second) were very fruitful. The communication was two-way, too: on numerous occasions when we made requests, the developers were surprised or suddenly understood where we were coming from.

I had a sense of the developers being let loose after years of being held back, and it’s obvious that they are completely aware of the pain points that external developers and business partners have had, and are happy to have the opportunity and resources for actively solving them.


Don't ask about the chicken.

HCL’s strategy

The penny dropped a number of times as I suddenly started to understand exactly what was coming our way, and how the different improvements and new developments fit together in the broader HCL strategy.

Business: play to Domino’s strengths: easy AppDev and secure, low-maintenance, on-premises infrastructure


We had the pleasure of listening to Darren Oberst, who heads HCL's PCP division and explained to us the logic of HCL's investment in the Domino/Notes IP. And that logic is consistent: HCL has recognised that not all customers are cloud-frenzied, and that having one's own on-premises infrastructure is valuable where data privacy and control are important. This is especially true in the German and Swiss markets.

It makes a refreshing change from IBM's 'cloud first' mantra, which left Domino sticking out like a sore thumb in IBM's portfolio.

HCL are making the web-based mail client, Verse, work well on-premises, for instance without the need for a separate IBM Connections environment.

There is a sweet spot for Domino as a 'one-server-does-all' for small and medium businesses, a segment that was largely ignored by IBM in the past.

I’m hoping that, with some judicious investments in modernising Domino, we business partners can start offering Domino to completely new customers, instead of restricting ourselves to customers who already have an installation.

Increase robustness of the Domino server

There is a team called 'Total Cost of Ownership' whose purpose I didn't initially understand. The penny dropped when I realised that its work is to increase the robustness of the Domino server, to make it an attractive investment for SMBs too, companies that aren't big enough to have their own dedicated IT staff but still want an on-premises infrastructure. In short, to make Domino a ubiquitous offering, which is great!

Out of the many improvements that were announced, the lifting of the 64GB database limit was very welcome, as was the commitment to breathing some life into Sametime, including getting rid of the monstrous WebSphere database in the background. My jaw dropped when one of the casual comments was that search was going to be upgraded so that the indexes are updated immediately with new entries. It has been such a pain point in the past to explain to customers why they can't find the information they've just entered, and I wasn't expecting a resolution any time soon.

Impressive, also, was the self-repairing logic that is going to be included: fixups happening automatically, automatic checks that replicas are correctly distributed, and automated recovery of corrupt databases. This goes completely in the direction of making the Domino server so reliable that it can be offered to SMBs without qualms.

Node.js opens up Domino to modern web developers.

It took me a while to get this. The essential point is that with the new integration with node.js, modern web developers do not need to have any Domino-specific knowledge to start coding with Domino as a back-end NoSQL server. All this new stuff is not aimed at hard-core Domino developers, who already know all the hoops and workarounds needed to get Domino speaking to other systems. It’s ‘code with Domino without opening Domino Designer’.

The DominoDB module will be available as an npm package. Download it, require it, and you can create a connection to a database and start performing CRUD operations against it. To complement this (and this was a welcome surprise), the HCL Team is working on a SQL-like query language (the Domino General Query Facility) which will return JavaScript arrays of documents. This promises higher query speeds, and the facility will be available from any of the Domino languages.
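To make this concrete, here is a minimal Node.js sketch of what coding against Domino as a NoSQL back-end could look like. Everything below (the package name, the function names, the host, port, file path and query syntax) is an assumption for illustration; the real API may well differ when it ships.

[code gutter="false"]
// Hypothetical sketch only: package name, API and query syntax are assumptions.
const { useServer } = require('domino-db'); // assumed npm package name

async function demo() {
  // Connect to a Domino server and open a database (invented host, port and file path)
  const server = await useServer({
    hostName: 'domino.example.com',
    connection: { port: 3002 }
  });
  const database = await server.useDatabase({ filePath: 'customers.nsf' });

  // Create a document
  await database.bulkCreateDocuments({
    documents: [{ Form: 'Customer', Name: 'Acme AG', City: 'Bern' }]
  });

  // Read it back with a SQL-like (DGQF-style) selection
  const result = await database.bulkReadDocuments({
    query: "Form = 'Customer' and City = 'Bern'"
  });
  console.log(result.documents); // a plain JavaScript array of documents
}

demo().catch(console.error);
[/code]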

Another pain point that was addressed is Identity and Access Management (IAM), which will follow the OIDC/OAuth standards and will be accessible as a service.

Hackathon

Three teams participated in HCL’s Hackathon. (I can’t yet say what one of the teams did)

‘web app start’ application

The hackathon runners-up:

Graham Acres, Richard Jefts, Bernd Gewehr, Jason Gary, Christian Güdemann, Thilo Volprich, Thomas Hampel

In no time at all, this team created a 'web app start' PWA that connects to a Domino server once and once only, securely stores the credentials locally, and then redirects the user to other Domino-based web apps. Wonderful, and a tour de force in the time available.

Enabling Domino Compatibility for modern apps built with Play Framework

The hackathon winners:

Richard Jefts, Dan Dumont, Andrew Magerman, Ellen Feaheny, Jason Gary, Steve Nikopoulos, Dave Delay

Our team pursued a more conceptual idea (we didn't code anything).

Ellen Feaheny leads AppFusions, a company that sells IaaS (Integration as a Service) products, and she came up with the idea of leveraging Domino together with apps built with the Play Framework.

Play, by Lightbend, is used by large enterprises for microservices architectures written in Scala or Java. It is especially good for apps that need to be designed for large-scale use from the get-go.

We kidnapped the HCL Team doing the node.js integration and brainstormed with them on how this could be implemented, sorting through the high-level requirements.

How does the gRPC protocol work?

Steve Nikopoulos and Brenton Chasse spent a long time holding my hand and slowly explaining to me how the gRPC protocol works.

As far as I’ve understood:

  • It's a very efficient protocol between two machines for delivering remote procedure calls, i.e. calls to execute something on a remote machine, using commands that look local to the caller.
  • The transport is done over HTTP/2 (so there's already a speed advantage because the connection remains open).
  • The messages sent are small, because they sit inside the TCP envelope of the HTTP call, and because the format of the message is predetermined and the declarative part of the message is kept to a minimum through various tricks.
  • Crucially, messages can contain other messages, so you have an out-of-the-box possibility of sending a tree of nodes, which I found very satisfying.

Think of it as a very clever lossless compression for messages which have a predetermined format (for instance, 'bulk update documents', or 'delete this document').

The starting point for defining the protocol is a .proto file, which defines the data structure of the messages very precisely (see grpc.io). Once these are defined, you use the protocol buffer compiler to generate data access classes in any of the supported languages (all the favourites are already there).

To put it another way: gRPC comes with tools that let you create 'stubs' in any computer language you want, and these stubs are able to 'compress' and 'decompress' the messages. As an analogy, think of the stubs that were generated when building SOAP interfaces.
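As an illustration, here is a minimal Node.js sketch of what consuming such a published .proto file could look like, using the standard gRPC tooling for JavaScript. The domino.proto file, the DocumentService service, the GetDocument call, and the host and port are all invented for this example; only the general mechanics (load the .proto, generate a stub, call it as if it were local) are the point.

[code gutter="false"]
// Hypothetical sketch: the .proto file, service and method names are invented.
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

// Load the (hypothetical) published proto definition and generate the client stub.
const packageDefinition = protoLoader.loadSync('domino.proto');
const domino = grpc.loadPackageDefinition(packageDefinition).domino;

// The stub makes the remote procedure call look like a local function call.
const client = new domino.DocumentService(
  'domino.example.com:3003',
  grpc.credentials.createInsecure()
);

client.getDocument({ unid: 'ABCD1234' }, (err, document) => {
  if (err) return console.error(err);
  console.log(document); // nested messages arrive as plain JavaScript objects
});
[/code]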

Our idea:

If HCL makes the .proto file for the gRPC protocol public, then anybody can create an interface to the gRPC Server which will come with Domino.

This was just a concept, and there are still some open questions, e.g. who supports the solution in case something goes wrong, how one organises versioning, how one ensures compatibility between different versions of the protocol…

Andrew Davis told us that the HCL Team loved the idea, and it was rewarded with the first prize! In fact, Andrew even committed in front of the audience to working with AppFusions (and others who wish to get involved) to see how this could be made real, sooner rather than later.

Conclusion

Domino is going places. The CWP Factory Tour was an exhilarating meet-up of the core Development team and the core Domino community, with very productive exchanges of ideas. I haven’t been this optimistic about the future of Domino/Notes in years. Watch this space!

Please call me if you want some help or advice about the future of Domino!

The lucky attendees

Modernization of Ortho-Team’s Reservation System, a success story

In 2017 Ortho-Team released a new version of their website, and the look of the existing reservation system clashed with the new site. Something had to be done.

Ortho-Team is a leading Swiss provider of orthopaedic products. Most products are made to measure for the customer, which means every sale is linked to at least one appointment for advice and measurement. Ortho-Team has a reservation system based on its IBM Domino infrastructure: customers can choose an appointment for the location and product they need. These appointments are dynamically calculated from the available free time of all Ortho-Team employees.

The original system was built using classic Domino Web technology and was working reliably and robustly. But the design had become old-fashioned, and crucially the reservation system was unusable on mobile phones.

Decision: keep as much of the existing system as possible

After considering several different scenarios, we decided to stick to the classic, robust technology in the back-end and revamp the UI with a modern, reactive interface. The reasoning was that this had the lowest potential impact, and the costs would also be the lowest for the desired effect.

The User Interface Design and coding of the necessary HTML and JavaScript files were done by the talented team at e621.ch. The coding on the Domino side was done by Magerman GmbH.

We’ve shown the results in the following pictures. The transition is now seamless!

If you want to update your existing classic Domino Web applications, don’t hesitate to contact us!

Lessons learned:

  • If you have classic Domino Web applications, it’s possible and cost-effective to revamp the UI using modern, reactive Front-End technologies.
  • The biggest limitation is that we had to stick to the original workflow of consecutive sequential pages. In Classic Domino Web Development, business logic is carried out on the server by serving sequential pages, and we wanted to keep the business logic where it was. A ‘single-page’ UI was not conceivable.
  • We stumbled from time to time on technological limitations of the Domino platform, but could quickly find alternatives.
  • From the start, we set up a testing environment which was accessible to all parties involved. This proved to be very effective; we ended up finalising the end product through many small iterations.

Engage 2018 recap

Engage 2018 was the best Engage I’ve ever attended. Not only was the organisation and venue superlative, but the recent energy of the new HCL team was infectious.

IBM's Bob Schulz delivered a soporific and uninspired keynote speech, in sharp contradistinction to Richard Jefts' and Andrew Manby's energetic and funny presentation. Their enthusiasm was infectious.


New announcements:

HCL Nomad

Most promising was the presentation by Andrew Davis showcasing Nomad, which is a Notes Client running on an iPad. He demoed the current version, which only works online. They are working on improving the controls (dropdowns, checklists etc.) to work on a touchscreen, and that seemed to be working well. Links (doclinks, database links) are working.

The next necessary step is making Nomad work offline, which would imply implementing the replication engine, but also local full-text indexing. I imagine that will be challenging. Promising, also, is the extension of LotusScript to access the mobile device's goodies, such as the camera and location.

I can see a good use case for field service workers who need a fully offline machine, and I can also imagine this would be a very easy way to surface pure Notes Client applications to the iPad at very little cost (since the codebase does not need to be modified).

I'm not convinced by the attempt to make Notes applications work on the iPhone; the format is just not compatible with existing application layouts.


Andrew Davis showing how Node.js will be integrated into the Domino stack – architecturally elegant, methinks.

HCL Places

A surprise was Jason Gary's prototype demonstration of 'HCL Places', a universal integrated chat client/Notes client which would run in a single application, with data storage on the Domino server. It's one of those products that HCL can market independently. Looks promising, but probably needs a lot of work for it to fly.


Notes Client:

Ram Krishnamurthy showed some of the new changes in the Notes Client. I found most of them to be small, incremental UI changes. The Workspace is getting a makeover, and I hope that what was shown is not the final version – the spacing of the tiles was all wrong in my opinion.

The real improvements were the decision to merge the Mac and Windows codebases for the client (presumably as a result of high Mac usage within IBM) and the simultaneous release of all codebases (including Sametime, which caused many of the headaches of Domino 9.0.1 FP10). Both these changes should improve the quality of the delivery.

As a side note, it's interesting that both HCL Places and Nomad integrate the 'pure' C++ Notes client. This reinforces my feeling that there is true beauty and robustness in the 'small', 'basic' client, and that the 2008 revamp using Eclipse RCP is just a flaky, clumsy abomination whose costs (slowness, size, difficulty to configure, architecture breaks) are not compensated by its advantages. HCL announced they would sunset unused functionality, which is welcome. Hogne B. Pettersen suggested ditching the RCP implementation entirely, and I secretly cheered.

XPages:

Some small improvements were announced, but I think it's clear that XPages is a dead end.

New interesting technologies and skills:

Of particular interest was Knut Herrmann and Paul Harrison's session on Progressive Web Applications, an easy way to adapt a web application so that it behaves like a native mobile app, without the hassle of going through an app store's review process.


Also fascinating was Paul Withers' and John Jardin's session 'Domino and Javascript Development Master Class'. Node.RED is definitely a technology I'd like to look into. As a tip, Paul suggested making REST tests directly with Node.RED rather than with Postman, for instance. And Node.RED is an easy way to deliver scheduled actions.


People

The most enjoyable thing about Engage is meeting the community. Probably because of the long history of the platform, real ties of friendship make this community special, much more so than at any of the other gatherings I've attended over the past years. There is always somebody who can help!


IBM Champions on stage

Can HCL deliver?

That’s the crucial question. Whereas I am impressed by the new team’s vision and obvious energy, my gut feeling is that they are overambitious and risk underdelivering. Jason Gary mentioned ‘we’re just like a startup’, which made me wince since in this space, with a large legacy codebase, you don’t have the luxury of a green field. The risk I see is V10 breaking existing functionality, which would be fatal.

During the Q&A I asked Richard Jefts who was responsible for the Notes 9.0.1 FP10 release, as the quality assurance on that release was sub-par. To his credit, Richard took it on the chin, assumed responsibility, and told us that a separate QA post had been created to avoid this happening again. Here again, the risk is delivering, under time pressure, a software product that isn't sufficiently tested.

Thanks to the Engage Team and the sponsors

Theo Heselmans did a sterling job of organising the event, as usual, and the venue, the spectacular SS Rotterdam, a permanently moored cruise ship, was particularly well suited: the sessions were on the ship, and we slept and ate on the ship. Many thanks to the sponsors who make this event possible, and free!


On a personal note, the gods of luck were with me and I won the two prizes that tickled my fancy: a LEGO set from OnTime and a Raspberry Pi in a BBC Micro B housing from LDC Via. Awesome!


Matt White from LDC Via presenting me with their awesome giveaway – a Raspberry Pi housed in a BBC Micro B

IBM Champion!

IBM announced the nominations for IBM Champions 2017 and I’m very happy to say that I was nominated.

It's wonderful to have this sort of recognition, and suddenly being part of a group who have made huge contributions to the IBM ICS community over the years is humbling.

I strongly suspect that my nomination is linked to my co-organising the Swiss Notes User Group, so I’d like to mention at this point my co-conspirators, without whose work SNoUG wouldn’t happen:

Diana Leuenberger, Susanne Schugg and Helmut Sproll

And also many thanks to those that have contributed behind the scenes to make SNoUG a success – I think in particular here of Paul Harrison, with his wonderful AgendaWALL and AgendaPAD, as well as the very talented LDC Via Group, who help me whenever I’m stuck.

I'm also proud of strengthening the Swiss IBM Champions group, showing that plucky little Switzerland can pack a good punch!

From 3000 milliseconds to 200 to open a page: Speed up your Notes Client applications by discovering their bottlenecks.

I had inherited a template for a database which was taking 3 seconds to open any sort of document. It soon got to be irritating, and I noticed that the little lightning bolt at the bottom left of the Notes client was flashing wildly, so I fired up the Notes RPC Parser to try and see what all the traffic was about.

The culprit was soon found: a subform containing buttons whose labels were dynamically computed depending on the user's preferred language. Unfortunately, it was making an @DbLookup for every single button, and an uncached one at that. The Notes RPC Parser showed that each @DbLookup generated 3 NRPC calls, for a total of about 50ms – and this was done 30 times each time a document was opened, which accounts for roughly 1.5 seconds on its own.

That, and another issue that I corrected, meant that I could massively improve the performance of the application for only about half a day’s worth of work.


Feel free to use the tool, or if you can’t be bothered, hire me to solve your Notes Client performance issues.

P.S. I’ve just released a new version of the tool Notes RPC Parser, which you can download here:

Why do Software Developers always squawk when inheriting a foreign codebase?

The first thing that Software Engineers do when they inherit a codebase that they haven’t written is to squawk loudly.

This is terrible! That’s not the way to do it! We need to re-write this completely!

I recently had an experience when a project manager suddenly lost his developer and asked me to take over. ‘Here is the list of requirements from the customer, they’re just small changes’.

I said 'OK, I'll give this a look, it doesn't look too difficult'. And then I started looking into the code, and I started grumbling more and more. There were no comments. No source control. No test code. The code I had in front of me was not working correctly, and I had no access to a codebase which was working. And within the code, there were so many moments when I was thinking 'Oh no, why would you do that? I'd really do it differently'.

So I promptly squawked. And the project manager expressed his incredulity. ‘It’s always the same with developers! They always complain of code that they haven’t written themselves!’.

So, here’s a little guide to explain why this happens.

We’re not masons.

That's the crux. The job of building walls is well defined, and if a wall is not complete, you can hire another mason to finish the job. He'll come to the site, the tools he sees will be familiar, and he can start right where the previous person stopped. Easy.

That's not how it works with software. There is a wide range of ways to solve any given problem, all the wider since we can basically write our own tools. Another person's building site is going to be completely unfamiliar.

So here are my insights for non-developers, in the hope that you will get over the impression that we are whiners, and understand that software development is not masonry.

Software is easier to write than to read.

After about six months, even the code you have written yourself becomes more difficult to read. A good developer will be writing his code not just well enough for the machine to understand (bad developers stop at that point), but well enough for humans to understand.

This is a pretty accurate picture of what it's like reading somebody else's code:

http://i.imgur.com/UGv2WaB.png

Software can be really badly written, and you can't see it from the outside.

This ties in with the points I was making in a previous post. Here we're talking about maintainability, the ease with which a particular codebase can be modified. Here are some best practices which you should insist that your developers follow:

  • One is to separate the responsibility of code into little independent boxes, which each do only one specific thing. If you need to change something, there is only one place to look. This is called modularity and follows the concept of separation of concerns.
  • Another is having source control. This shows all the changes that have been made to the codebase over time, including who made them, and enables us to revert the code to what it was in the past (when a bug has been introduced, for instance).
  • Automated tests make every small bit of the code run through its logic, often by trying to make it break. There is one thing that we software developers are scared of, and which happens a lot: we change a bit of code here and then notice that loads of other bits of code are no longer working. Automated tests enable you to make bold changes, because you immediately get feedback on whether you've broken something.
  • Finally, documentation really helps. And please note that I'm not talking about some separate Word document that was once written for the client; I'm talking about comments, embedded within the code, explaining why things were done in a particular way.

If a software developer inherits a codebase which didn’t follow these guidelines, then that code is going to be a pain to maintain and to change. And as a developer, it’s a large source of frustration because we are in effect far slower at delivering changes than our predecessor, and our customer is not going to be impressed.

So, if you are a customer, request that these basics are followed, or else you’ve got a bad case of developer lock-in, where Bob is the only person who can maintain the code.

In conclusion:

If you are a customer and switch developers or providers for a project, you'll have to factor in some budget and time just for the new developer to get acquainted with the project. You'll also see that it will initially take longer to effect changes, depending on how well and how maintainably the software was written in the first place.

Software is an iceberg. You’ll get hurt by the submerged parts.

Software Quality is an iceberg.

The tip of the iceberg is the small part sticking out that everybody can see. A non-IT person judges software with a far smaller subset of criteria than an IT professional: functionality, performance, and usability.

To a normal user, the interface is the software. I regularly have to choke off exasperated comments when a customer tells me 'but it's only one extra button'.

Lurking beneath the murky waters are those criteria which only IT professionals see: reliability, maintainability, security.

End-users should not be expected to see these 'underlying' aspects, but if the decision-makers (i.e. the ones holding the purse strings) don't see under the water, you're doomed to produce bad software. Tell-tale signs are:

  • There is no budget for IT maintenance
  • The only time fresh money is given is for new functionality – money is never given for functionality-free maintenance
  • Projects get stopped as soon as the software is barely out of beta testing. As a programmer, you've finally produced software that works, but in your mind this is just a first version. There are lots of bits and bobs of useless code left in your software, debug code that writes to system.out, the object and data structures are a mess and could be optimised, and you've yet to apply a profiler to your code to see where the bottlenecks are. And then the project gets stopped: 'it works well enough'. That is very upsetting, and possibly one of the main reasons that in-house IT jobs are far less attractive than IT jobs within software companies.

If you recognise your company here, then you should give more power to the IT department. Too many companies treat IT as a necessary, needlessly costly evil. IT should be an essential, central part of your company strategy. There is no other area of a company's activity which gives you more of a competitive edge.

Too many companies (and I am looking at banks and insurers in particular) have failed to see that they are, essentially, software companies. You need to have a CIO who understands his trade, and accept that he is going to spend money on things which don't make sense to you, or at least don't seem to change anything.

If you are only ready to pay for the IT work which is visible (more functionality, for instance) and never for a refactoring exercise which would add stability but not change the functionality, the accumulated technical debt is going to bite you at some point. It might be that an application becomes exponentially more difficult to change, or that you suffer a massive data loss you can't recover from because the backup software was not up to date and nobody ever did an actual restore as part of regular maintenance.

If you are a developer and you find yourself in such an environment, you'll have to go guerrilla. You'll need to surreptitiously improve your code whenever you have the opportunity. Try to leave any code you touch a little bit better than before. Notice duplications and remove them. If you didn't understand part of the code, document it once you have understood it. If variable names are unclear, rename them to something better. It's a thankless occupation, because there is the risk that you break things that were working, and nobody will really care that you reduced the codebase by half, but it will help you progress and it will increase the quality of your work.

IT is surprising in that the perception of quality is often completely at odds with real quality, because of this iceberg effect. I once worked in an environment where colleagues were doing terrible things (changing code directly in the production environment, and only doing quick-and-dirty cosmetic patches) but were actually perceived by the customers as delivering excellent quality, because the response times were so fast.

Getting git

git is wonderful. In the same way that Unix is an 'operating system done right', git is 'source control done right'.

It has a surprising history, being basically the creation of Linus Torvalds, who decided that all the available source control solutions were useless, and so just wrote his own. That is fabulously cool.

This is a small article to introduce you to git.

Learning Curve

There definitely is a learning curve with git. At least that was the case for me.

For starters, I had to throw away some concepts which I thought were set in stone. Git forces you to think about code in a different way. It turns out that that way of thinking has large advantages.

The single biggest mental stumbling block for me was that I always considered the real code, the precious bit, to be the one I'm currently working on, and the code that is in the source control system to be a backup. A byproduct, so to speak. You need to completely overturn this with git.

The real stuff, the precious stuff, is what is in your repository. What you are currently working with, your working directory, that’s something like a scrapbook. You’re trying things out, making changes here and there, but it’s all very temporary and it’s not where the real stuff is. It shouldn’t be frightening to overwrite your working directory, for instance.

Repository

Let me backtrack a little and introduce the three areas of a local git setup (I am not talking about a remote repository; I am talking about a local git repository, on your local hard drive, and blazingly fast).

The repository is where the history of your project is stored. It's quite similar to a series of snapshots of your whole project. Each of these snapshots contains your whole project (technically, git stores the data very efficiently and doesn't duplicate unchanged content, but conceptually, think of these snapshots as whole copies of your project).

These snapshots are called commits. They are unique, and can be referred to in a number of different ways, either directly (by their identifier, which is a SHA-1 hash) or relatively ('three commits before the last one I did').
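For example (the abbreviated hash is, of course, an invented one):

[code gutter="false"]
git show 1a2b3c4   # refer to a commit directly, by (a prefix of) its SHA-1 hash
git show HEAD~3    # refer to a commit relatively: three commits before HEAD
[/code]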

Each commit points to a parent commit. When you have two different commits pointing to the same parent, those two commits are the starting points of different branches. From this branching point, the two code streams start having a life of their own, although it's possible to merge different branches together at a later stage.
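A minimal example of branching and merging (the branch names are purely illustrative):

[code gutter="false"]
git branch my-feature      # create a new branch pointing at the current commit
git checkout my-feature    # switch the working directory to that branch
# ... make some commits on my-feature ...
git checkout master        # go back to the other code stream
git merge my-feature       # and merge the two streams together again
[/code]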

Decentralized weirdness

This is a good time to introduce another weirdness that took me some time to grok. There is no primacy between branches. It’s a delightfully anarchic system, no bosses – a little like the anarcho-syndicalist communes of ‘The Holy Grail’. There is no master control program, no centralized master system. One can define certain branches to be more ‘important’ than others, but it’s just a naming convention, it’s not something that is built in.

The same applies between your local repository and the remote repository. Git really doesn’t assume that one of them is more important than the other. It’s very disquieting, but liberating once you’ve understood it.

Working Directory

Next, your working directory. This is where your whole project lives: the code, images, what have you. This is what gets 'snapshotted' each time you make a commit. You could snapshot your whole project every time, i.e. every single file, but it turns out that that is a bad idea, because you end up saving too much and it gets really difficult to understand what exactly happened between two commits. I must admit to having done that at the beginning, because I was stuck in my 'backup' mindset.

No, what you want to do is make lots of small commits, with each commit changing as few files as possible, as long as they pertain to one common change.
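In practice, that workflow looks something like this (the file names and the commit message are just illustrative); git add -p is particularly handy because it lets you stage only some of the changes within a single file:

[code gutter="false"]
git status                    # see which files have changed
git diff                      # review the changes themselves
git add login.js login.css    # stage only the files belonging to this one change
git add -p validation.js      # stage only some of the hunks in a file, interactively
git commit -m "Fix validation of the login form"
[/code]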

Staging Area

Enter the staging area. Really well named, this is where you determine which files you want to belong to the next commit. You're not interested in the files that have changed only because something trivial like a timestamp has changed, and you're not interested in unrelated changes either. You really want to encapsulate all the changes necessary for a single bug fix or a single new feature into one commit.

Overview

The following diagram shows the basic logic: you add files to the staging area with the git add command, then you commit those added files with a git commit command. Bringing back a specific commit to the working directory is done with the git checkout command.

Diagram: working directory → (git add) → staging area → (git commit) → repository → (git checkout) → working directory

So, you’ve got this conceptually? Working directory, repository, staging area? Right. This is what you should work with.

I got very confused by how this is actually built. The repository and the staging area are stored in a folder called .git which is contained within the working directory.

This made my head wobble. Surely, not inside the directory? Surely, that way lies folly? Infinite recursions? Surely it should be external? But that’s how it’s built. Inside. It’s elegant once you start thinking about it but it made my head hurt.

Start with the command line

I would recommend starting with a command-line interface. I know that there are many GUIs out there (SourceTree being a particularly good one), but all they are doing is adding an interface on top of the command-line instructions.

I would recommend moving to a GUI once you've learnt the basics of git. Additionally, the command line is very helpful for noobs, as it notices when you have typed something that doesn't make sense and makes a suggestion. Stack Overflow and the git documentation are the places to go if you get stuck.

http://gitref.org/ is a good, compact resource.

You can download Git Bash, the git command line for Windows, here.

Recommended Reading

This is a tip from Eric McCormick.


Git in Practice

Useful Commands

Here I have listed those commands which I am using regularly and feel comfortable with:

Initializing

[code gutter="false"]
git init
[/code]

The first initialization creates the magical hidden folder .git. This is where the actual repository is stored (see above). You'll need to run it from within the directory which you want to 'track'.

Staging

Staging means ‘saying these are the files I wish to commit’

The command for staging is

[code gutter="false"]
git add <filename>
[/code]

alternatively you can use

[code gutter="false"]
git add .
[/code]

which will add all the files in the current directory (. is the current directory)

Committing

[code gutter="false"]
git commit --message='Committing message'
[/code]

The convention is to write a message with multiple lines, structured a bit like an e-mail: the first line should be like the subject of the e-mail, and the other lines are the 'body' of the e-mail.
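If you run git commit without a message, git opens your editor so that you can write a subject line plus a body. Alternatively, you can pass -m more than once; each -m becomes its own paragraph. An illustrative example (the message text is invented):

[code gutter="false"]
git commit -m "Fix date format in the invoice export" -m "The export used the US date format, but our customers expect dd.mm.yyyy. Also adds a test for the formatter."
[/code]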

Creating a remote repository:

[code gutter="false"]
git remote
[/code]

shows the remote repository linked to the current repository. You'll need to have created this beforehand on the GitHub site.

I use GitHub, so the example here is with GitHub. First, log into GitHub and create a repository there. Note the URL of the remote repository (there is a convenient button for this).

[code gutter="false"]
git remote add origin <your remote repository URL>
[/code]

Note: origin here is one of those conventions; it's the default name of the remote repository. Don't make the mistake of naming one of your branches origin. Remember, it's just a convention!

[code gutter="false"]
git remote -v
[/code]

shows the URLs linked to your remote repository.

[code gutter="false"]
git remote show origin
[/code]

This command shows all the branches on the origin repository (the remote one; origin is its name by default).

Fetching

[code gutter="false"]
git fetch
[/code]

fetches the data from the external repository but does not merge the data.

Pulling

[code gutter="false"]
git pull
[/code]

fetches, then merges the data.
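In other words, git pull is roughly equivalent to the following two commands (assuming the remote is called origin and you are working on master):

[code gutter="false"]
git fetch origin
git merge origin/master
[/code]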

Pushing

[code gutter="false"]
git push <remote> <local branch name>:<remote branch to push into>
[/code]

pushes the data from the local repository to the remote one.
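A typical concrete usage, following the usual naming conventions:

[code gutter="false"]
git push origin master:master   # push the local master branch into the remote master branch
git push origin master          # shorthand when both branches have the same name
[/code]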

Checkout (restore files or change branch)

[code gutter="false"]
git checkout
[/code]

This switches branches or restores working tree files. It’s a bit of an uncomfortable command at first because it seems to be doing two completely different things. Actually, it’s doing the exact same thing – it’s overwriting your working directory with information from a particular commit. Your working directory is a scrapbook, remember?

I find this useful to avoid committing fluff (my technical term for files which have technically changed (a timestamp, for instance) but which contain no real changes). I add the files that I want committed with git add, then I do a git commit, and then a

[code gutter="false"]
git checkout -- .
[/code]

to overwrite the fluff in the working directory.

If you notice you've just committed nonsense (in this example I replaced the whole 'About This Database' document with nothing), you want to restore the file to what it was before. Here git checkout comes to the rescue again:

[code gutter="false"]
git checkout HEAD~1 -- odp/Resources/AboutDocument
[/code]

HEAD~1 is a way to refer to a specific commit, i.e. 'the commit at HEAD' minus (~) one (1).

Conventions for naming files and directories:

dash dash (--)

If you see -- before a filename, it's a disambiguator saying 'what follows are necessarily files'.

dot

. is the current directory – so, everything.

Syntax for defining an entire directory with everything in it

odp/Resources/ would define that particular directory, along with anything inside it. Note the trailing slash!

Conventions for naming commits:

HEAD: HEAD is a pointer to the commit that was last checked out into the working directory. Think of it as 'the commit corresponding to the current state of my working directory, before I made any changes'.

<commit name>~1: The tilde here means 'minus', so this would be the commit before <commit name>.

Configuration options that are useful

In Git Bash, I immediately increase the font size to something readable on our big screens, and then run

[code gutter="false"]
git config --global core.autocrlf false
[/code]

which avoids the annoying LF/CRLF warnings, since I'm mostly working in a Windows environment (sniff).
Do this before your initial commit! (see https://help.github.com/articles/dealing-with-line-endings/)

Setting up SSH keys with Git Bash

Since I'm lazy and don't want to type in my username and password every time I connect to the remote repository, here's how to do it cleanly via SSH.

[code gutter="false"]
ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
[/code]

Press Enter to accept the default file in which to save the key, then enter a passphrase (a complicated one, alright?).

Add the key to the ssh-agent. (This is a background agent whose job is to automatically enter your passphrase for you. You should be aware that this makes your local physical machine a lot more vulnerable.)

[code gutter="false"]
eval $(ssh-agent -s)
ssh-add ~/.ssh/id_rsa
[/code]

copy the PUBLIC key to the clipboard (Windows)

[code gutter="false"]
clip < ~/.ssh/id_rsa.pub
[/code]

or on macOS

[code gutter="false"]
pbcopy < ~/.ssh/id_rsa.pub
[/code]

add the SSH key to your GitHub account, then test the keys with

[code gutter="false"]
ssh -T git@github.com
[/code]

Ignoring certain files or directories

There are some occasions when you don't want certain files to be tracked by git: typically binaries built from your source code, or hidden files generated by your IDE.

You can see which files are not being tracked with this command:

[code gutter="false"]
git ls-files --others --exclude-standard
[/code]
or alternatively,

[code gutter="false"]
git add -A -n
[/code]

There are unfortunately many different ways to define the excluded files. There is a central configuration file, whose location one can find with

[code gutter="false"]
git config --get core.excludesfile
[/code]

Within your project, there are two files that control the exclusions:

[code gutter="false"]
.gitignore
.git/info/exclude
[/code]

Confused? Here is an explanation from Junio Hamano (the maintainer of Git)

The .gitignore and .git/info/exclude are the two UIs to invoke the same mechanism. In-tree .gitignore are to be shared among project members (i.e. everybody working on the project should consider the paths that match the ignore pattern in there as cruft). On the other hand, .git/info/exclude is meant for personal ignore patterns (i.e. you, while working on the project, consider them as cruft).
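As an illustration, a small .gitignore could look like this; the patterns are typical examples rather than a recommendation for any particular project:

[code gutter="false"]
# build output and binaries
build/
*.class
*.o

# IDE and OS cruft
.settings/
.project
.DS_Store
Thumbs.db
[/code]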

This command combination is practical (thanks Akira Yamammoto)

[code gutter="false"]
git rm -r --cached .
[/code]

(recursively remove all the cached, i.e. tracked, files from the current directory) and then

[code gutter="false"]
git add .
git commit -m "fixed untracked files"
[/code]

Notes/Domino best practices

Best practice when gitting an ODP project: never disconnect the NSF from the Eclipse project, and never change the NSF the ODP is connected to.

Thanks to Adrian Osterwalder for the tips.