Friday, October 23, 2009

A Fairy, A Samurai and a Cowboy Walk Into A Casino… Day 3 of SharePoint Conference 2009


My first session on Day 3 was the most mind-blowing one yet, I think.

The speaker was Doron Bar-Caspi, a Sr. Program Manager with the SharePoint Customer Advisory Team (CAT).  The topic was best practices for geographically distributed SharePoint 2010 solutions.  Because I work for a provider of hosted SharePoint, I find this a very relevant topic, and it was one that was very challenging to address in SharePoint 2007.

To start off, Doron provided some useful data points about latency expectations from different points around the globe.


I learned a lot of new things in this session.  It was a lot to take in.  I highly recommend watching Doron’s session on MySPC if you have access, so you can get even more detail than I have attempted to provide below, but here goes….

FSSHTTP

A new protocol (to me anyway) is FSSHTTP, which stands for File Synchronization via SOAP over HTTP (HTTP, of course, still stands for Hypertext Transfer Protocol).  With it, Office 2010 apps and SharePoint 2010 reduce network traffic: when a document changes, the client sends just the diffs back to the server rather than the whole document.  Sending the whole document would involve latency that is particularly painful for users in geographically distributed scenarios.
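
To make the "just send the diffs" idea concrete, here is a minimal Python sketch.  It is not the FSSHTTP wire format, which the session did not go into at this level; it simply assumes fixed-size chunks compared by hash, and the names `changed_chunks` and `apply_delta` and the tiny chunk size are my own inventions for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow (assumption)


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Cut a byte string into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_digest(chunk: bytes) -> str:
    """Hash one chunk so two versions can be compared without shipping content."""
    return hashlib.sha256(chunk).hexdigest()


def changed_chunks(old: bytes, new: bytes, size: int = CHUNK_SIZE) -> dict:
    """Return {chunk index: new bytes} for chunks that differ from the old version."""
    old_digests = [chunk_digest(c) for c in split_chunks(old, size)]
    delta = {}
    for index, chunk in enumerate(split_chunks(new, size)):
        if index >= len(old_digests) or old_digests[index] != chunk_digest(chunk):
            delta[index] = chunk
    return delta


def apply_delta(old: bytes, delta: dict, new_length: int, size: int = CHUNK_SIZE) -> bytes:
    """Rebuild the new version on the receiving side from the old copy plus the delta."""
    chunk_count = -(-new_length // size)  # ceiling division
    chunks = split_chunks(old, size)[:chunk_count]
    chunks += [b""] * (chunk_count - len(chunks))
    for index, chunk in delta.items():
        chunks[index] = chunk
    return b"".join(chunks)[:new_length]
```

In this toy model, a one-byte edit in a large file costs one chunk on the wire instead of the whole file.  A real implementation would also have to cope with insertions that shift every later chunk, which this sketch ignores.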

Visual Round Trip Analyzer (VRTA)

Parts of the demo, in which Doron demonstrated network performance and characteristics, were performed using a utility called “Visual Round Trip Analyzer,” which can be downloaded from Microsoft.  The network emulator he used will be part of Visual Studio Team System 2010.

Office Document Cache (ODC)

Something else new (to me) is the Office Document Cache, which sits on the client side and coordinates synchronizations back to the SharePoint Server, much the way your Outbox contents are asynchronously sent back to Exchange.  This is a very compelling feature of Office 2010 and SharePoint 2010, because the client application can commit updates back to the server asynchronously.  The user can trust the Office Document Cache to eventually sync with SharePoint and, in the meantime, keep working in the client application without waiting for the file save to complete.  This is especially important for large files, like those 100,000,000 row Excel files that are going to be possible in Office 2010.

The Office Document Cache works even if the user has taken the document offline and saved it back.  The updated copy is queued up, and when connectivity is restored, the file is uploaded back to the server.  Here again, FSSHTTP ensures that only the diffs actually go over the wire, all in the name of reducing latency.
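
The Outbox analogy can be sketched in a few lines of Python.  This is only my mental model of the behavior described in the session, not how the Office Document Cache is actually built; `FakeServer` and `DocumentCache` are hypothetical names, and the sketch models only the offline queue, not background threading.

```python
from collections import deque


class FakeServer:
    """Stand-in for SharePoint: remembers the latest content per file name."""

    def __init__(self):
        self.files = {}

    def upload(self, name, content):
        self.files[name] = content


class DocumentCache:
    """Client-side queue: saves return immediately and are sent once online."""

    def __init__(self, server):
        self.server = server
        self.online = False
        self.pending = deque()

    def save(self, name, content):
        # The save never fails for lack of connectivity; it is only queued.
        self.pending.append((name, content))
        if self.online:
            self.flush()

    def reconnect(self):
        # Connectivity is back: send everything that piled up, oldest first.
        self.online = True
        self.flush()

    def flush(self):
        while self.pending:
            name, content = self.pending.popleft()
            self.server.upload(name, content)
```

Two saves made offline leave the server untouched until `reconnect()` is called, at which point both are replayed in order and the server ends up with the latest version.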

As I said in a tweet upon learning about this feature today, I believe The Office Document Cache feature will tend to compel global enterprises to adopt Office 2010 as soon as possible.

But what about conflicts?  If the user can take the file offline, couldn’t someone else start editing it in parallel?  The answer is yes, they could, but the good news is that Microsoft has built multi-master conflict resolution of changes into the product.  I need to get a little more clarity on the exact rules here, but my understanding at this point is that if two copies of a file work their way back to the server, and one of them has an edit to paragraph A and the other has edits only in paragraph C, both changes are merged together to form a new version of the file in SharePoint.  That is intuitive and something I think end users will be able to understand.
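
Since I don't yet know the exact rules, the following Python sketch is only a guess at the simplest case described above: a paragraph-by-paragraph three-way merge against the common base version.  The function name `merge_paragraphs` and the "keep mine and flag it" choice for true conflicts are my assumptions, not Microsoft's documented behavior.

```python
def merge_paragraphs(base, mine, theirs):
    """Three-way merge of two edited copies against their common base.

    Each argument is a list of paragraph strings. Returns (merged, conflicts),
    where conflicts lists the indexes that both copies changed differently.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles edits that keep the paragraph count")

    merged, conflicts = [], []
    for index, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:
            merged.append(m)        # both copies agree (or neither changed it)
        elif m == b:
            merged.append(t)        # only the other copy changed this paragraph
        elif t == b:
            merged.append(m)        # only my copy changed this paragraph
        else:
            conflicts.append(index)  # both changed it differently
            merged.append(m)         # assumption: keep mine and flag the conflict
    return merged, conflicts
```

With an edit to paragraph A in one copy and an edit to paragraph C in the other, the merged result contains both edits and the conflict list is empty.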

Speaking of offline access to documents, Microsoft has rebranded the product formerly known as Groove to SharePoint Workspace.  Both the Office Document Cache and SharePoint Workspace synchronize with SharePoint upon reconnect.

But, wait, there’s more! 

Office Web Applications

The major Office applications, namely Word, PowerPoint, Excel and OneNote, come with browser-based clients, so you can now open your files in a browser-based version, which downloads faster than the full client application.  That is another bonus for remote workers.

Mobile Device Performance Support

As I sat there furiously tweeting away on my Blackberry, Doron began to address the concerns of the increasing need to provide excellent support for mobile devices.  Is the mobile experience going to be equivalent to the PC experience?  Well, no, although it is easy to predict that we’ll see some fancy mobile apps coming out that support SharePoint in ways that build on top of the fundamental mobile support the platform is going to provide out of the box.

In geographically distributed models, network capabilities and performance vary, of course, and that is likely to remain a challenge indefinitely.  So, what Microsoft has done is build in options that transmit the minimum amount of information over the wireless network, thereby keeping the annoying wait times as your mobile device pulls down content from SharePoint to a minimum.

So let’s say you are a user with a mobile device and you want to look through a document library that you know about, but you have to identify which file in that library is the right file.  In the pre-2010 world, out of the box, you’d have to download those files one by one, spinning up whatever application required to view the document, if you could in fact view the document at all.

In the 2010 world, you can look through a plain text-based list that requires minimal data to cross the network to you.  From there, you have basically three options for what to do with the documents in that list.  You can open that, say, Word file, in a plain text version, which shrinks the size considerably, so it passes more quickly over the network.  This is great, but if you wanted to see the formatting (tables, fonts, images, etc.) in the document, plain text obviously can’t do that.  So, you can opt to pull down an image of the page in .jpg format, which is a compressed image file format of course, and that means a very small file traveling over the network to you.

So, now you’ve been able to see the file well enough to confirm that it is indeed the right one, and that all happened quickly and with ease.  Now, you can “download a copy” of the file you have identified, and the full file comes across to you.  By giving you options to try before the full file download, you can look around for what you need much much faster.  I like this concept, and I can’t wait to see what phone app developers will do to make the mobile experience even richer and faster.

So now that you’ve got your geographically distributed solution serving your end users all over the globe, and you’ve got it working efficiently in the branch offices and on a variety of devices, the question is how do you share services across all of that infrastructure?  What Microsoft has done to address this challenge is to break out Shared Service Providers into a new paradigm, in which these types of services are delivered via a mechanism called Service Applications.

Service Applications

Service Applications, unlike Shared Service Providers, can be shared across farms.  This is a tremendous advantage because now you can use, for example, the Profile Store, or Search, across all of the farms in your solution.  If you’re wondering, did Jeff just say “All of the farms in my solution?!”   Yes, many larger organizations have multiple farms running, whether with some sort of log shipping for DR, with some form of 3rd party replication, or simply to have full farm capabilities located as close as possible to the target users in the right nodes on the WAN.

So what you can do in a geographically distributed model in SharePoint 2010 is deploy several farms, and distribute them around the globe as relevant, say one farm in North America, one in the EU and one in Asia-Pac.  But, you can optionally have a single service application supporting all of them.  You simply (okay, maybe simply is an overstatement of the relative ease of doing this, but it is doable) need to establish the service application in one “master” farm, and then you can “publish” that service to the other farms.

So now, you can do things with the various service applications, such as enforcing a taxonomy or other standards globally across multiple farms.  Also, associating web applications with services is more flexible now, and you can get better isolation, load balancing, etc.  Powerful stuff!

But wait, there’s more!

Uninterrupted Log Shipping

A common replication model for SharePoint DR and/or geographically distributed farms is SQL log shipping.  The problem has always been getting the log shipping to work with the DR or target database online and connected to the app while users are accessing it, because while the logs are being processed, the target database cannot be read.  In global deployments, you really don’t ever want the system to be offline, because users are accessing it 24/7. 

In SharePoint 2010, there is a new concept called Uninterrupted Log Shipping that basically enables you to actively work with two different databases in a single read-only farm.  What you can then do with PowerShell cmdlets is set up the read-only farm to process logs on one of the two read-only databases, while the SharePoint application is working with the other read-only database. 

Then, once the first database is done processing the logs, you switch the read-only farm over to it for read-only use by SharePoint, and the other database begins processing the logs from the master.  And the process repeats endlessly.  This technique enables you to avoid multi-minute outages at the read-only farm while logs are being processed, and the configuration is being altered to deal with the change in database connection.

As a result of the above, the logs can be updated continuously, and the user experience is not disrupted by the process.  The penalty of course is that the read-only farm will need twice the storage that the master farm has, because it is storing two full copies of the database at once.
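
The alternating pattern is easier to see in code than in prose.  This Python sketch models only the bookkeeping, with made-up names (`ReadOnlyFarm`, `process_logs_and_switch`, `db_a`, `db_b`); the real setup is done with SQL Server log shipping and SharePoint PowerShell cmdlets that I am not reproducing here.

```python
class ReadOnlyFarm:
    """Two full copies of the content database: one serves reads, one restores logs."""

    def __init__(self):
        # For each copy, the list of transaction log ids already applied to it.
        self.databases = {"db_a": [], "db_b": []}
        self.serving = "db_a"

    @property
    def restoring(self):
        """The copy that is NOT serving users and is therefore free to apply logs."""
        return "db_b" if self.serving == "db_a" else "db_a"

    def process_logs_and_switch(self, shipped_logs):
        """Catch the idle copy up with every shipped log, then point the farm at it."""
        target = self.restoring
        applied = self.databases[target]
        applied.extend(shipped_logs[len(applied):])  # apply only what this copy lacks
        self.serving = target  # users now read the freshly updated copy
        return target
```

Each call leaves users on a copy that is never mid-restore, at the cost of keeping two complete databases, which is the storage penalty mentioned above.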

The read/write farm might have sites being added and deleted while log shipping is going along under the hood.  Meanwhile, the Search Service Application, for example, runs as a distinct service app instance in each farm and crawls content in parallel to the log shipping process.  Eventually, the site map on the read-only farm is going to get out of sync with the changes coming in through SQL log shipping, and search could potentially return out-of-date, broken links.  So, whenever you are using the uninterrupted log shipping capability, as soon as the log processing has finished, you need to call, from PowerShell, a method on the content database object (it is a method of SPContentDatabase rather than a standalone cmdlet):

 SPContentDatabase.RefreshSitesInConfigurationDatabase();

and what this will do is make sure that the farm recognizes any changes that have occurred to the site structure.

Whew… That’s all, right?  Nope.  There’s more!

Windows 7 BranchCache

Let’s say you don’t want to have multiple farms.  You want your SharePoint content to all be centralized.  This is a common preference.

BranchCache - Distributed Cache Mode

It is typical, when there is a centralized farm and remote users in a branch office, that the branch users connect to a VPN and then communicate back to the central server over a common WAN link, which therefore sees high utilization, making the SharePoint app feel slow to respond.

To help minimize traffic over that common link, Distributed Cache Mode means that a document can be downloaded to one user in the branch office, which is a normal transfer of a file over the network.  Then, when a second user in the branch office requests that same file, the SharePoint 2010 server (running on Windows Server 2008 R2) knows that the first user has already downloaded that file and is in the same branch, so it tells the second user’s machine to download that file from the first user’s machine, in a peer-to-peer paradigm.  This helps to ensure that the network traffic back to the central SharePoint server is limited to the messaging about the request and the response that a copy already exists in the BranchCache.

BranchCache – Hosted Cache Mode

An alternative approach called Hosted Cache Mode works essentially the same way but without the peer-to-peer element.  In this scenario, a dedicated server is deployed to the branch office, and it maintains the BranchCache and the connectivity back to the central SharePoint farm.  When users in the branch make requests, they go directly to the central SharePoint server, and the central server responds to the request with a unique id for the file requested.  The user’s machine then checks to see if that id is available in the cache.  If not, the file is first requested from the central server, and then, while the user is using it, the user’s machine places a copy of that file into the cache server within the branch in the background.

Then, when a second user in the branch makes a request for the same document, the central SharePoint server sends back the same id, and the second user’s machine recognizes that that id already exists in the cache, and thus, the file is served to the second user from the branch cache. 

You might ask why this involves a request to the central server at all if the file requested is already in the BranchCache.  Why not simply start by querying the BranchCache and, if the file is not found there, only then go to the central server?  One reason is that if you did that, you would not have usage statistics on the files requested.  In this arrangement, the SharePoint server keeps seamless track of file requests, and logs all of that for reporting and analysis, even though the actual requested file is served to the end user from a machine located in the branch.
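
Here is how I picture the Hosted Cache Mode flow, as a small Python sketch.  It is a simplification under my own assumptions: the "unique id" is modeled as a SHA-256 hash of the content, the branch cache is a plain dictionary, and `CentralServer` and `fetch` are hypothetical names rather than anything from the BranchCache API.

```python
import hashlib


class CentralServer:
    """Stand-in for the central SharePoint farm."""

    def __init__(self, files):
        self.files = files        # path -> bytes
        self.request_log = []     # every request is recorded, cached or not
        self.bytes_sent = 0       # full-file traffic over the WAN link

    def get_content_id(self, path):
        """Cheap call: log the request and return an id for the current content."""
        self.request_log.append(path)
        return hashlib.sha256(self.files[path]).hexdigest()

    def download(self, path):
        """Expensive call: ship the whole file across the WAN."""
        data = self.files[path]
        self.bytes_sent += len(data)
        return data


def fetch(path, server, branch_cache):
    """Ask the central server for the id first, then prefer the branch cache."""
    content_id = server.get_content_id(path)
    if content_id in branch_cache:
        return branch_cache[content_id]      # served locally in the branch
    data = server.download(path)             # first requester pays the WAN cost
    branch_cache[content_id] = data          # later requesters find it in the cache
    return data
```

Two users fetching the same document produce two entries in the server's request log but only one full download over the WAN, which is exactly the usage-statistics benefit described above.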

Wow. Wow. Wow.

As I left that session, I was pleased to see that some pastry and coffee were available in the halls, which for some reason was a rare treat at this conference.  Based on past Tech Eds and other conferences, I expected there to be plenty of refreshments in the major hallways during every break each day, but that was simply not the case at this show.  You generally had to walk all the way back to the exhibit hall to get even coffee, and good luck finding a free soda anywhere.  I know that Nintex provided very nice reusable water bottles, but when you need caffeine or some calories, that doesn’t quite fit the bill.  In future conferences, I would hope they go a little bit further with snacks and beverages.

Anyway, next session for me was Capacity and Performance Planning for SharePoint 2010, presented by Zohar Raz, Senior Program Manager, and Kfir Ami-ad, Senior Test Lead, from Microsoft.


Zohar made it clear that Microsoft has been listening to the performance challenges people have been experiencing in SharePoint 2007, and they have put some new controls into SharePoint 2010 that will help with performance.  Performance reliability at scale is a “big bet” Microsoft made in SharePoint 2010.

To complicate matters, SharePoint Server 2010 is doing “more, more and more.” 


There are three times as many services that can be enabled in the out of the box product, and each tier has more to do than it did before.

Some of the cool points in this session included the following:

  • Large scale solutions definitely require some strategic planning upfront.  These are situations where a consultant is advisable.
  • Since you can now have multiple farms with service applications published across all of them, it is now possible to isolate Search, for example, to its own farm, and federate it across all the other farms.
  • In scenarios where SharePoint is being used over the WAN, end users need to be on the latest browsers for the best performance, because IE8 can have 8 simultaneous connections, for example, distributing the burden of downloads and permitting the user to keep working.
  • Because the office client applications save back to SharePoint asynchronously in Office 2010/SharePoint 2010 scenarios, the client application remains functional during the save process, so the user does not feel the performance degrade.  Perception is huge, and this really helps the end user feel like things are moving along fast.
  • As far as large lists and such enhancements in both SharePoint 2010 and the client apps in Office 2010, Zohar cautioned against hitting all of those upper limits at once.  While tremendous scale is possible, you can still see problems depending on combinations of factors.
  • 100 GB is still a good rule of thumb for the largest size you want your content database to reach before you subdivide it.  This is especially true in situations involving heavy read/write, such as team collaboration sites.
  • Throttling is a key new feature that helps to maintain optimal performance.  Large list throttling is offered, as is the ability to throttle excessive client load.  Without throttling on, even just trying to delete a very large list can bring the SharePoint solution to a crawl for other users, especially in prime usage times.  But in SharePoint 2010, the IT administrator is in charge of such latency spikes, and can set windows of time (“happy hours”) during which large list functions are permissible, and during other times, can have SharePoint avoid giving too much resource to large list operations, so the vast majority of your users do not feel slowness because of what one user is doing.
  • Capacity management is a recurring cycle, not a task.  It requires that you scale and adapt to changing needs easily.  This is one reason that virtualization technology can be your friend, as it allows you a great deal of flexibility to scale and adapt to changing performance conditions and cyclical demand periods, such as holiday shopping season, open enrollment periods, etc.
  • The fact that there is an extensible framework for the logging database is a big win for IT, because now you can write your own queries and more easily perform very specific forensic queries to figure out what is happening.
  • Microsoft has devised some standard architectures of different scales.
  • When you scale as high as the large farm architecture, you are allocating servers to specific purposes within each tier.
  • System Center Capacity Planner is gone in SharePoint 2010.  A replacement has yet to be announced.
  • An attendee asked what to do if you are already over 100 GB in your content database.  The answer was you should first move to SharePoint 2010, and then deal with carving up your large site collections, because carving site collections up is much much easier in SharePoint 2010.
  • In SharePoint 2010, you can gradually delete a large site collection, meaning that the intense database hit that is required to achieve this does not happen in one large atomic action that would kill performance on the system until completed.
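
The “happy hours” idea from the throttling bullet above can be sketched as a simple rule in Python.  The function names, the 5,000-item threshold and the 10 p.m. to 6 a.m. window are assumptions for illustration only; the real settings are whatever the farm administrator configures.

```python
from datetime import time


def in_window(now, start, end):
    """True if `now` falls inside [start, end), including windows that span midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def is_operation_allowed(item_count, now, threshold=5000,
                         happy_hours=(time(22, 0), time(6, 0))):
    """Small operations always run; large ones only run during the happy-hours window."""
    if item_count <= threshold:
        return True
    return in_window(now, happy_hours[0], happy_hours[1])
```

Under these assumed settings, a 20,000-item operation is refused at noon but allowed at 11:30 p.m., while a 100-item operation runs at any time.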

For lunch, I went to the HP session in Palm D.  HP described some very nice capabilities and tools they can provide to customers running SharePoint whether hosting with HP or on-prem, or through another hosting provider.  HP is a competitor in this space, and it was nice to hear how they present their offerings.  They are an impressive provider.

After that, a colleague in sales at work called me to let me know she had discovered that a customer we are speaking to about hosting SharePoint was actually attending the conference as well, and she had suggested that we meet.  So, I skipped the third session and had a terrific talk with the customer, who is thinking about a number of smart ways to move forward in a challenging project to rearrange his company’s IT portfolio and simplify maintenance and management of several solutions, including SharePoint.  Hosting of several applications, including SharePoint, is key to his strategy.  It was fun to do some real work and brainstorm collaboratively with a real customer about how some of what we’re learning at the conference can be applied to an actual project.

After that, I went into the Exhibit Hall and spent some time talking to some key vendors who I want to develop closer relationships with.  I talked also with the Microsoft Online Services team about how their offerings differ from ours. 

As presented at the show, I felt Microsoft made MOS sound a bit like the only option for IT shops looking to have their SharePoint hosted and managed.  As we discussed, though, there are a number of scenarios that MOS is simply not going to have an interest in serving.  Obvious ones include a customer who wants to host all or a significant portion of an IT portfolio that includes Oracle apps or databases or PeopleSoft applications.  Others include cases where we already manage the customer’s entire network and could provide a direct connection right into our data centers.  On top of that, we provide consulting services for SharePoint in-house, as opposed to requiring a customer to contract with multiple vendors to get the whole project done.  So while I foresee steady growth for MOS, with their standard model in particular, I’m sure that other hosting companies are likewise seeing specific areas where Microsoft Online Services will simply not provide a comprehensive solution for many enterprise customers.  I anticipate a thriving competition among several of the top hosting companies to provide great services in this area.

At this point, you’re probably wondering what the heck the title of this post is all about…  It’s no joke, there really was a SharePoint Fairy (she even merited her own hash tag on Twitter; search #spfairy).

It turns out, if memory serves, that she is in marketing at a firm in Atlanta called Unbounded Solutions, or for short, “USI,” which is funny, because the company I worked for before we were acquired was USinternetworking or “USi” for short.

Ironically, #spfairy was being interviewed, I believe, by the SharePoint Samurai himself, Mike @gannotti, with whom @ThunderLizard and I posed for this photo, which is suitable for framing if you’d like a copy.

Later, @ThunderLizard and I stopped by the Commerce Server 2009 kiosk on the exhibit floor and learned some specific details about how Commerce Services for SharePoint actually works right now in MOSS 2007.  We have clients who are looking for solutions that tie the SharePoint content management publishing model to e-Commerce sites, so this is an area of interest.  I am not clear just yet on how the Commerce Services platform will be adapted to fit the new SharePoint 2010 model, but at least the Commerce Services are pretty much all web parts as it is, so compatibility issues would probably be the main thing, and I expect Microsoft to address that issue with service packs, etc.

I learned from a post by a user on Twitter that the Hands On Labs for developers can be downloaded now from Microsoft’s website.

We watched a little bit of the Rock Band competition (“SharePoint Idol”), grabbed a bite to eat, and then headed over to the Ask The Experts area, where I spoke briefly to two of my favorite presenters at the show, Doron Bar-Caspi and Zach Rosenfield.

We ran into Kirk Evans again there.  Kirk is the Microsoft Architect Evangelist who talked to me about shooting a video interview about my company’s fully managed hosting offering for SharePoint for his Channel 9 show “The Water Cooler.”  We spontaneously shot the video in about 10 minutes.  It’s working its way through the necessary approval channels, and should be posted soon.  I’ll post a note when it’s up, so you can take a look if you’re interested.  For now, check out this interview Kirk did at the conference with Tom Rizzo, Senior Director, SharePoint.

After that, the three of us went out to the Hofbräuhaus for a great meal, drinks and a great sing-along band.

When we got back, we hung out at the House of Blues with a few other attendees, including Eric, the “SharePoint Cowboy,” a very friendly and funny guy whom I’ve been following on Twitter for months but hadn’t met yet, before I finally called it a night.


Hard to believe, but already we were down to just another half day to go…
