databases, Scripting, System Center, Technology

What Not to Do With ConfigMgr, 1.0.1

[note: this post has been sitting in my drafts folder for over a year, but recent events reminded me to dust it off and post it]

One of my colleagues, the infamous @chadstech, sent a link to our team, to the slide deck from the Channel9 session (MS04) “30 Things you should never do with System Center Configuration Manager” by @orinthomas and @maccaoz. If you haven’t seen (or read) it already, I strongly recommend doing so first.

It’s from 2016, and even though it’s a few years old now, it still holds up very well in mid-2019. However, everyone who’s ever worked with that product knows that the list could become a Netflix series.

This blog post is not going to repeat the above; instead, I’ll append to the list some things I still see in a variety of environments today. Things which really should be nipped in the bud, so to speak. Baby steps.

Using a Site Server like a Desktop

Don’t do it. Install the console on your crappy little desktop or laptop and use that. Leave your poor server alone. Avoid logging into servers (in general) unless you REALLY need to perform local tasks, and that’s it. Anything you CAN do remotely, should be done remotely.

If installing/maintaining the ConfigMgr console is your concern: forget that. The days of having to build and deploy console packages are gone. Install it once, and let it update itself when new versions are available. Kind of like Notepad++. Nice and easy.

Why? Because…

  • Using a server as a daily desktop workspace drags on resources and performance.
  • It also creates a greater security and stability risk to the environment.
  • The more casual you are with your servers, the sloppier you’ll get, and eventually you’ll do something you’ll regret.

Whatever your excuse has been thus far, stop it.

Anti-Virus Over-Protection

Even in 2019, with so many tools floating about like Symantec, McAfee, Sophos, CrowdStrike, and so on, when I ask if the “exclusions” are configured to support Configuration Manager, I often get a confused look or an embarrassing chuckle. Gah!!! Chalkboard scratch!

There are several lists of things to exclude from “real-time” or “on-demand” scanning, like this one, and this one. Pick one. Failing to do this VERY often leads to breaks in processes like application deployments, software updates deployments, and policy updates.

Also important: with each new release of Configuration Manager, read the release notes and look for new folders, log files, services or processes that may be introduced. Be sure to adjust your exclusions to suit.

Ignoring Group Policy Conflicts

Whatever you’re doing with regards to GPO settings, make damned sure you’re not also doing the same things with Configuration Manager. The two “can” be combined (in rare cases) to address a configuration control requirement, and you can sew two heads on a cow, but that doesn’t mean it’s the best approach.

Pick one, or the other, only. If you have WSUS settings deployed by GPO, and are getting ready to roll out Software Updates Management via Configuration Manager, stop and carefully review what the GPO’s are doing and make adjustments to remove any possible conflicts.

And, for the sake of caffeine: DOCUMENT your settings wherever they live. GPO’s, CI’s or CB’s in ConfigMgr, scheduled tasks, whatever. DOCUMENT THEM! Use the “Comments” or “Description” fields to your advantage. They can be mined and analyzed easily (take a look at PowerShell module GPODOC for example / shameless plug).
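Here’s a rough sketch of what “mining” those fields can look like. The function name Get-UndocumentedItem and the sample objects are made up for illustration; the only assumption is that each object has a Description property, which is the case for what Get-GPO returns from the GroupPolicy module.

```powershell
# Hypothetical helper: passes through any object whose Description is empty
function Get-UndocumentedItem {
    param (
        [parameter(Mandatory=$True, ValueFromPipeline=$True)]
        [psobject] $InputObject
    )
    process {
        if ([string]::IsNullOrWhiteSpace($InputObject.Description)) {
            $InputObject
        }
    }
}

# Made-up sample data standing in for real GPO objects
$sample = @(
    [pscustomobject]@{ DisplayName = 'Workstation Baseline'; Description = 'Owner: desktop team. WSUS and firewall settings.' }
    [pscustomobject]@{ DisplayName = 'Test Policy 3'; Description = '' }
)

$undocumented = $sample | Get-UndocumentedItem
$undocumented | Select-Object -Property DisplayName

# Against a real domain (requires the GroupPolicy module):
# Get-GPO -All | Get-UndocumentedItem | Select-Object DisplayName
```

Anything that shows up in that list is a setting nobody bothered to explain. Go fix it.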

One-Site-Fits-All Deployments

I’ve seen places that only use packages, or only use Task Sequences, or only use script wrapping, or only repackage with AdminStudio (or some alternative). That’s like doing every repair job in your house or apartment with a crowbar.

There’s nothing wrong with ANY means of deploying software as long as it’s the most efficient and reliable option for the situation. Just don’t knee-jerk into using one hammer for every nail, screw, and bolt you come across.

Pick the right tool or method for each situation/application. Doing everything “only” one way is ridiculously inefficient and time-wasting.

Sharing SQL Instances

The SQL licensing that comes with a System Center license does not permit hosting third-party products. Not even your own in-house projects, technically speaking. You “can” do it, but you’re not supposed to.

What that means is, when you run into a problem with the SQL Server side of things, and you call Microsoft, and they look at it and see you have added a bunch of unsupported things to it, you’ll likely get the polite scripted response, “Thank you for being a customer. You appear to be running in an unsupported configuration. Unfortunately, we can’t provide assistance unless you are running in a supported configuration. Please address this first and re-open your case, if needed, for us to help. Thank you. Have a nice day. Bye bye now.”

And, now, you’re facing an extended duration of what could have been a simple problem (or no problem at all, since your third-party app might be the problem).

Configuration Manager is extremely demanding of its SQL resources. Careful tuning and maintenance are VERY VERY VERY often the difference between a smooth-running site, and an absolute piece of shit site. I can’t stress that enough.

Leeching SQL Resources

Some 3rd party products, which I’m advised not to name for various legal reasons, provide “connection” services into the Configuration Manager database (or SMS provider). Attaching things to any system incurs a performance cost.

Before you consider installing a “trial” copy of one of those in your production environment, do it in a test environment first. Benchmark your environment before installing it, then again after. Pay particularly close attention to what controls that product provides over connection tuning (polling frequency, types of batch operations, etc.).

And, for God’s sake (if you’re an atheist, just replace that with whatever cheeseburger or vegan deity you prefer), if you did install some connected product, do some diagnostic checking to see what it’s really doing under the hood.

And just as important: if you let go of the trial (or didn’t renew a purchased license) – UNINSTALL that product and make sure its sticky little tentacles are also removed.

Ignoring Backups

Make sure backups are configured and working properly. If you haven’t done a site restore/recovery before, or it’s been a while, try it out in an isolated test environment. Make sure you understand how it works, and how it behaves (duration, results, options, etc.).

Ignoring the Logs

Every single time I get a question from a customer or colleague about some “problem” or “issue” with anything ConfigMgr (or Windows/Office) related, I usually ask “what do the logs show?” I’d say, on average, that around 80% of the time, I get silence or “hold on, I’ll check”.

If you ask me for help with any Microsoft product or technology, the first thing I will do is ask questions. The second thing I will do is look at the appropriate logs (or the Windows Event Logs).

So, when the log says “unable to connect to <insert URL here>” and I read that, and try to connect to same URL and can’t, I will say “Looks like the site isn’t responding. Here’s my invoice for $40,000 and an Amazon gift card”. And then you say “but I could’ve done that for free?!” I will just smile, and hold out my greedy little hand.

Keep in mind that the server and client logs may change with new releases. New features often add new log files to look at.

Check the logs first.
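If digging through log folders by hand sounds tedious, something like the following sketch can do a first pass. Find-LogError is a made-up name, the default search pattern is just a guess at useful keywords, and the sample log lines are invented for the demo. On a real ConfigMgr client you would point it at C:\Windows\CCM\Logs instead of the temp folder.

```powershell
# Hypothetical helper: lists the most recent lines matching a pattern across all .log files in a folder
function Find-LogError {
    param (
        [parameter(Mandatory=$True)]
        [string] $Path,
        [string] $Pattern = 'error|failed|unable to',
        [int] $Last = 20
    )
    Get-ChildItem -Path $Path -Filter '*.log' -File |
        Select-String -Pattern $Pattern |
        Select-Object -Last $Last -Property Filename, LineNumber, Line
}

# Demo against a throwaway folder with invented log content
$demoPath = Join-Path ([System.IO.Path]::GetTempPath()) "logdemo-$([guid]::NewGuid())"
New-Item -Path $demoPath -ItemType Directory | Out-Null
Set-Content -Path (Join-Path $demoPath 'sample.log') -Value @(
    'Starting policy evaluation'
    'Unable to connect to the management point'
    'Policy evaluation complete'
)

$hits = Find-LogError -Path $demoPath
$hits | Format-Table -AutoSize
```

It won’t replace CMTrace, but it answers “what do the logs show?” before anyone has to ask.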

Ignoring AD: Cleanups

Managers: “How accurate is Configuration Manager?”

Answer: “How clean is your environment?”

Managers: (confused look)

If you don’t have a process in place to ensure your environment is maintained to remove invalid objects and data, any system that depends on that will also be inaccurate. It’s just a basic law of nature.

Step 1 – Clean up Active Directory. Remove accounts that no longer exist. Move unconfirmed accounts to a designated OU until verified or removed. This process is EASY to automate, by the way.

Step 2 – Adjust ConfigMgr discovery method settings to suit your environment. Don’t poll for changes every hour if things really only change monthly. And don’t poll once a month if things really change weekly. You get the idea. Just don’t be stupid. Drink more coffee and think it through.

Step 3 – I don’t have a step 3, but the fact that you actually read to this point brings a tear to my eyes. Thank you!
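Since Step 1 claims the cleanup is easy to automate, here’s a minimal sketch of the “find the stale ones” part. Select-StaleAccount is a made-up name, the 90-day threshold is an arbitrary assumption, and the sample accounts are invented. The filter only relies on a LastLogonDate property, which Get-ADComputer and Get-ADUser can return when asked for it.

```powershell
# Hypothetical filter: passes through accounts with no logon, or none within the last N days
function Select-StaleAccount {
    param (
        [parameter(Mandatory=$True, ValueFromPipeline=$True)]
        [psobject] $Account,
        [int] $DaysInactive = 90
    )
    begin {
        $cutoff = (Get-Date).AddDays(-$DaysInactive)
    }
    process {
        if ($null -eq $Account.LastLogonDate -or $Account.LastLogonDate -lt $cutoff) {
            $Account
        }
    }
}

# Made-up sample data standing in for real AD computer accounts
$accounts = @(
    [pscustomobject]@{ Name = 'PC001'; LastLogonDate = (Get-Date).AddDays(-5) }
    [pscustomobject]@{ Name = 'PC002'; LastLogonDate = (Get-Date).AddDays(-200) }
    [pscustomobject]@{ Name = 'PC003'; LastLogonDate = $null }
)

$stale = $accounts | Select-StaleAccount -DaysInactive 90
$stale | Select-Object -Property Name, LastLogonDate

# Against a real domain (requires the ActiveDirectory module):
# Get-ADComputer -Filter * -Properties LastLogonDate | Select-StaleAccount
```

What you do with the results (disable, move to a holding OU, delete after verification) depends on your own process. The point is that finding them shouldn’t be the hard part.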

Ignoring AD: Structural Changes

But wait – there’s more! Don’t forget to pay attention to these sneaky little turds:

  • Additions and changes to subnets, but forgetting to update Sites and Services
  • Changes to domain controllers, but not updating DNS, Sites and Services or DHCP
  • Changes to OUs, but forgetting to update GPO links
  • All the above + forgetting to adjust ConfigMgr discovery methods to suit.

Ignoring DNS and DHCP

“It’s never DNS!” is really not that funny, because it’s very often DNS. Or the refusal to admit there might be a problem with DNS. For whatever reason, many admins treat DNS like it’s their child. If you suggest there might be something wrong with it, it’s like a teacher suggesting their child might be a brat, or stupid, or worse: a politician. The other source of weirdness is DHCP and its interaction with DNS.

Take some time to review your environment and see if you should make adjustments to DHCP lease durations, DNS scavenging, and so on. Sometimes a little tweak here and there (with CAREFUL planning) can clean things up and remove a lot of client issues as well.

Check DHCP lease settings and DNS scavenging to make sure they are closely aligned to how often clients move around the environment (physically). This is especially relevant with multi-building campus environments with wi-fi and roaming devices.

Task Sequence Repetition

A few releases back, Microsoft added child Task Sequence features to ConfigMgr. If you’re unaware of this, read on.

Basically, you can insert steps which call other Task Sequences. In Orchestrator or Azure Automation parlance this is very much like Runbooks calling other Runbooks. Why is this important? Because it allows you to refactor your task sequences to make things simpler and easier to manage.

How so?

Let’s say you have a dozen Task Sequences, and many (or all) of them contain identical steps, like bundles of applications, configuration tasks, or driver installations. And each time something needs updating, like a new application version, or a new device driver, you have to edit each Task Sequence where you “recall” it being used. Eventually, you’ll miss one.

That’s how 737 Max planes fall out of the sky.

At the very least, it’s time wasted which could be better spent on other things, like drinking, gambling and shooting guns at things.

Create a new Task Sequence for each redundant step (or group of steps) used in other Task Sequences. Then replace those chunks of goo with a link to the new “child” Task Sequence. Now you can easily update things in one place and be done with it. Easy. Efficient.

Ignoring Staffing

Last, but certainly not least, is staffing. Typically, this refers to not having enough of it. In a few cases, it’s too many. If your organization expects you to cover Configuration Manager, and its SQL Server aspects, along with clients, deployments, imaging, updates, and configuration policies, AND maintain other systems or processes, it’s time for some discussion, or a new job.

If you are an IT manager, and allow your organization to end up with one person being critical to a critical business operation, that’s foolish. You are one drunk driver away from a massive problem.

An over-burdened employee won’t have time to create or maintain accurate documentation, so forget the crazy idea of finding a quick replacement and zero downtime.

In team situations, it’s important to encourage everyone to do their own learning, rather than depend on the lead “guru” all the time. This is another single point of failure situation you can avoid.

If there’s anyone who knows every single feature, process and quirk within Configuration Manager, I haven’t met them yet. I’ve been on calls with PFE’s and senior support folks and heard them say “Oh, I didn’t know that” at times. It doesn’t make sense to expect all of your knowledge to flow out of one person. Twitter, blogs, user groups, books, video tutorials, and more can help you gain a huge amount of awareness of features and best practices.

That’s all for now. Happy configuring! 🙂

Cloud, Scripting

Microsoft Teams and PowerShell

I just started playing around with the MicrosoftTeams PowerShell module (available in the PowerShell Gallery, use Find-Module MicrosoftTeams for more information). Here’s a quick sample of how you can get started using it…

$conn = Connect-MicrosoftTeams

# list all Teams
Get-Team

# get a specific Team
$team = Get-Team -DisplayName "Benefits"

# create a new Team
$team = New-Team -DisplayName "TechSupport" -Description "Technical Support" -Owner "dave@contoso.com"

# add a few channels to the new Team
New-TeamChannel -GroupId $team.GroupId -DisplayName "Forms Library" -Description "Forms and Templates"
New-TeamChannel -GroupId $team.GroupId -DisplayName "Customers" -Description "Information for customers"
New-TeamChannel -GroupId $team.GroupId -DisplayName "Development" -Description "Applications and DevOps teams"

# dump properties for one Team channel
$channelId = Get-TeamChannel -GroupId $team.GroupId |
Where-Object {$_.DisplayName -eq 'Development'} |
Select-Object -ExpandProperty Id

# add a user to a Team
Add-TeamUser -GroupId $team.GroupId -User "dory@contoso.com" -Role Member

Here’s a splatted form of the above example, in case it renders better on some displays…

$conn = Connect-MicrosoftTeams

# list all Teams
Get-Team

# get a specific Team
$team = Get-Team -DisplayName "Benefits"

# create a new Team
$params = @{
DisplayName = "TechSupport"
Description = "Technical Support"
Owner = "dave@contoso.com"
}
$team = New-Team @params

# add a few channels to the new Team
# NOTE: You could form an array to iterate more efficiently
$params = @{
GroupId = $team.GroupId
DisplayName = "Forms Library"
Description = "Forms and Templates"
}
New-TeamChannel @params

$params = @{
GroupId = $team.GroupId
DisplayName = "Customers"
Description = "Information for customers"
}
New-TeamChannel @params

$params = @{
GroupId = $team.GroupId
DisplayName = "Development"
Description = "Applications and DevOps teams"
}
New-TeamChannel @params

# dump properties for one Team channel
$channelId = Get-TeamChannel -GroupId $team.GroupId |
Where-Object {$_.DisplayName -eq 'Development'} |
Select-Object -ExpandProperty Id

# add a user to a Team
$params = @{
GroupId = $team.GroupId
User = "dory@contoso.com"
Role = 'Member'
}
Add-TeamUser @params
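As the NOTE in the splatted example hints, the three channel blocks can be collapsed into a loop over an array. This is only a sketch of that idea: the $groupId value is a placeholder (in the script above it would be $team.GroupId), and the actual New-TeamChannel call is left commented out so the snippet can run without a Teams connection.

```powershell
# Placeholder; in the example above this would be $team.GroupId
$groupId = '00000000-0000-0000-0000-000000000000'

# One entry per channel, instead of one splat block per channel
$channels = @(
    @{ DisplayName = 'Forms Library'; Description = 'Forms and Templates' }
    @{ DisplayName = 'Customers';     Description = 'Information for customers' }
    @{ DisplayName = 'Development';   Description = 'Applications and DevOps teams' }
)

# Build the splat hashtable for each channel
$channelParams = foreach ($channel in $channels) {
    @{
        GroupId     = $groupId
        DisplayName = $channel.DisplayName
        Description = $channel.Description
    }
}

foreach ($params in $channelParams) {
    Write-Host "Would create channel: $($params.DisplayName)"
    # New-TeamChannel @params
}
```

Adding a fourth channel then means adding one line to $channels rather than copying another block.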

Scripting, System Center, Technology

Captain’s Log: cmhealthcheck

I’ve consumed way way waaaaay too much coffee and tea today. Great for getting things done, not great for my future health.

CMHealthCheck 1.0.8 is in the midst of being waterboarded, kicked, beaten, tasered and pepper-sprayed to make it squeal. I’m close to a final release. Among the changes in testing:

  • Discovery Methods
  • Boundary Groups
  • Site Boundaries
  • Packages, Applications, Task Sequences (just summary), Boot Images (summary), etc.
  • User and Device Collections
  • SQL Memory allocation (max/pct)
  • Fixed “Local Groups” bug
  • Fixed “Local Users” bug
  • Enhanced Logical Disks report
  • Fixed “Installed Software” sorting issue
  • Fixed “Services” sorting issue
  • Fixed null-reference issues with “Installed Hotfixes”

Still in the works:

  • Sorting issue with ConfigMgr Roles installation table
  • Local Group Members listing
  • More details for Discovery Methods
  • Client Settings
  • ADR’s
  • Deployment Summary
  • Enhancements to the HTML reporting features

Stay tuned for more.

Note: The current posted version (as of 3/8/19) is 1.0.7, which is what will install if you use Install-Module.

To load the 1.0.8 test branch, go to the GitHub repo, change the branch drop-down from “master” to 1.0.8 (or whatever the other name happens to be at the time) and then use the Download option to get the .ZIP file. Then extract to a folder, and use Import-Module to import the .psd1 file and start playing.

Projects, Scripting, System Center, windows

sktools


UPDATE: 1/14/2019 – version 1901.13.2 was posted to address a problem with the previous upload.  Apparently, I posted an out-of-date build initially, so I’ll call this the “had another cup of coffee build”.

Dove-tailing from the previous idiotic blog post, I’ve taken some time off to retool, rethink, redesign and regurgitate “skattertools” as a single PowerShell module.  The new version blends PoSHServer into the module and removes the need to perform a separate install for the local web listener.  The first version of this is 1901.13.1 (as in 2019, 01 = January, 13 = the 13th, 1 = 1st release).

How to Install and Configure sktools

  • Open a PowerShell console using Run as Administrator
  • Type: Install-Module sktools
  • Type: Import-Module sktools
  • Type: Install-SkatterTools (this creates a default “sktools.txt” configuration file in your “Documents” folder)
  • Type: Start-SkatterTools
  • Open your browser and navigate to http://localhost:8080

This next part is only temporary, and will be improved upon soon:

  • Once the web console is open, expand “Support” and click “Settings” and modify to suit your Configuration Manager site environment.
  • Close and reopen the PowerShell console (still “Run as Administrator”)
  • Type: Start-SkatterTools
  • Refresh your web browser session

Work will continue until morale is eliminated.  Easter eggs are included, sort of.  Thoughts, feedback, bug reports, enhancement requests, angry snarky comments, are all welcome.  Enjoy!

Personal, Projects, Scripting, System Center, windows

And now for another stupid pet project

First, there was project number one. I called it “WWA”, which was a clever short name for “Windows Web Admin”. Even though I kept hearing it sounded like a name for a wrestling tournament.  Anyhow, it fell over and sank into a swamp.

Then there was project number two, or “AppAdmin”, which almost fell over and sank, but it was built inside a big shipyard, and they don’t let things sink there, so it floated for a while (I’m told it’s still afloat somehow).

Then, there was project number three, but I can’t state its name for legal reasons, or because I promised it might result in me delivering a flaming box of dog poo to a certain someone’s porch, after they ruined that project just as it was maturing, but that’s for another time and place.

Then there was project four, but I can’t talk about that one either, so I’ll skip to project five, CMWT.  But nobody cares about that one, so number six, was putting a hand-rubbed wax finish on someone else’s PowerShell script, and tossing it up on GitHub and PowerShell Gallery, along with projects seven, eight and nine.  And I’m surprised I still remember how to spell the number 8.  So anyhow…

Announcing SkatterTools

(imagine Morgan Freeman narrating from here on)

What is it?

Skatterbrainz Tools.  A really clever name.

It’s a portable web console app thing, for viewing and modifying things in your Active Directory and Configuration Manager environments, from the comfort of your beer-stained laptop.  Think of it like CMWT if it were (A) trying to copy the concept from Microsoft Windows Admin Center, and (B) didn’t require using a separate “server” or anything special**.  Yes, those are double-asterisks.  That means there’s some hidden footnote down below, but don’t look yet, I have to finish boring the shit out of you with this part first.

Why is it?

Because I needed a break from other things, like family matters during the holidays, a dog that loves chewing on furniture, and a 20 year old cat that wanders the house at 3am making really weird sounds.  And I just wanted to see if it was possible to…

  • Build a 100% web console UX to interface with AD and ConfigMgr using PowerShell
  • Not have to touch IIS or any web hosting mess
  • Make it customize-able, free, and open-source
  • Make it through the holidays once again

Where is it?

  • Like a lot of my stuff, it’s up on GitHub

What can it do?

  • View and cross-link:
    • AD users, computers, groups, sites, sitelinks, domain controllers, OUs
    • ConfigMgr users, devices, collections, applications, packages, boot images, task sequences, updates, and scripts
    • ConfigMgr site status, queries, discovery methods, certificates, Forest publishing, boundary groups and boundaries
    • Software inventory, software files
  • Manage:
    • Add/remove AD group members
    • View computers by AD user profile paths
    • Add/remove ConfigMgr collection members **
    • Those damned double-asterisks again, hmm.

Installation and Setup

  • Download PoSH Server here and install it (don’t worry, I checked it and it seems safe, you can trust me, I worked for the government once, sort of)
  • Download the GitHub repo (big green button – top right – zip option)
  • Extract the “poshserver” folder from the GitHub download into a local path like C:\ThisIsTheDumbestShitEver
  • Open the “config.txt” file and edit the settings to suit your needs
  • It is now ready to blow your mind, almost

Starting it Up

  • Add some gasoline and finely-crushed road flares, oh, wait, wrong stuff…
  • Create a desktop shortcut named “Start SkatterTools”…
Target: powershell.exe Start-PoshServer -HomeDirectory "c:\ThisIsTheDumbestShitEver" -CustomConfig "c:\ThisIsTheDumbestShitEver\sktools.ps1"
  • Create another desktop shortcut, named “Open SkatterTools” or “Coolest Shit Ever!”…
Target: http://localhost:8080/
  • Right-click the first shortcut, select “Run as Administrator”, and wait for it to open and say something like this…
  • Double-click the second shortcut and have your Kleenex box nearby
  • Click on one of the sidebar headings and watch the slick CSS stylings ooze all over your eyeballs and onto the floor.  Compliments of some sample code I found on W3Schools.  What a great site.

If you need to shut it down, just close the browser and close the PowerShell console.  There are instructions on the PoSH Server site for how to configure it like a service, so it runs as a background job. You don’t have to do that though.

Is there any official support?

  • Are you kidding?
  • You can submit bug reports and enhancement requests using the “Issues” link on the GitHub repo.
  • Work comes first.  I have to keep my customers happy and my bills paid
  • I’m still adding things to it frequently, but work may cause some delays getting around to it
  • You can submit your own changes via GitHub (pull requests, etc.) or just submit Issues if you prefer

Is there a roadmap?  Where is it going next?

  • Real (stupid) men don’t use maps!  The journey is the dream, man.
  • Where are any of us really going?  You ever ask that question?
  • Don’t ask that question, it’s depressing.  Enjoy the now.
  • Seriously, yes, I have a metric butt-ton of things I plan to add or improve

Double Asterisk-o-rama

  • Double-asterisks denote two things here:
    • Features are not yet complete.  Things will change.  Oceans dry up. Mountains wear down. Regimes are toppled.  Keith Richards is forever.
    • This is free stuff, and it comes without any strings attached.  No warranties, or guarantees.  No promises (other than it might possibly entertain you if you’re bored), and poor you gets to assume any and all liability, risk and responsibility for anything bad if you use it improperly or in a production environment of some kind, or any environment where alleged (love that word) damages may have occurred or been coerced by tertiary incidental hereinafters forthwith and notwithstanding, that are void where prohibited, taxed or regulated. Batteries not included.
    • Whatever that means

Cheers!

humor, Personal, Scripting, Technology

$HoHoHo = ($HoList | Do-HoHos -Days 12) version 1812.18.01


UPDATE: 2018.12.18 (1812.18.01) = Thanks to Jim Bezdan (@jimbezdan) for adding the speech synthesizer coolness!  I also fixed the counter in the internal loop.  Now it sounds like HAL 9000 but without getting your pod locked out of the mother ship. 😀

I’m feeling festive today.  And stupid.  But they’re not mutually exclusive, and neither am I, and so can you!   Let’s have some fun…

Paste all of this sticky mess into a file and save it with a .ps1 extension.  Then put on your Bing Crosby MP3 list and run it.

Download from GitHub: https://raw.githubusercontent.com/Skatterbrainz/Utilities/master/Invoke-HoHoHo.ps1

The function…

function Write-ProperCounter {
    param (
      [parameter(Mandatory=$True)]
      [ValidateRange(1,12)]
      [int] $Number
    )
    # 1-3 get their own suffixes; everything else in the validated range (4-12) ends in 'th'
    switch ($Number) {
        1 { return '1st' }
        2 { return '2nd' }
        3 { return '3rd' }
        default { return "$($Number)th" }
    }
}

The bag-o-gifts…

$gifts = (
    'a partridge in a Pear tree',
    'Turtle doves, and',
    'French hens',
    'Colly birds',
    'gold rings',
    'geese a-laying',
    'swans a-swimming',
    'maids a-milking',
    'ladies dancing',
    'lords a-leaping',
    'pipers piping',
    'drummers drumming'
)
# the sleigh ride...
Add-Type -AssemblyName System.Speech
$Speak = New-Object System.Speech.Synthesis.SpeechSynthesizer

for ($i = 0; $i -lt $gifts.Count; $i++) {
    Write-Host "On the $(Write-ProperCounter $($i + 1)) day of Christmas, my true love gave to me:"
    $Speak.Speak("On the $(Write-ProperCounter $($i + 1)) day of Christmas, my true love gave to me,")
    $mygifts = [string[]]$gifts[0..$i]
    [array]::Reverse($mygifts)
    $x = $i + 1
    foreach ($gift in $mygifts) {
        if ($x -eq 1) {
            $thisGift = $gift
        }
        else {
            $thisGift = "$x $gift"
        }
        Write-Host "...$thisGift"
        $Speak.Speak($thisGift)
        $x--
    }
}

Enjoy!

Projects, Scripting, Technology

The Little (Code) Stuff That (Sometimes) Matters

As a follow-up to the post about tuning PowerShell scripts, this is going to be more general (language neutral).  I’d like to run through some of the “efficiency” or optimization techniques that apply to all program/script languages, due to how they’re parsed, and executed at the lowest layer of a conventional x86/x64 system.

Why?  Good question.  I’ve been digging into some of the MIT OpenCourseware content and it brought back (good) memories from college studies.  So I figured, why not.

Condition Prioritization

Performance isn’t mentioned as much these days outside of gaming or content streaming topics.  But processing any iterative or selective tasks that deal with larger volumes of data can still benefit greatly from some very simple techniques.

Place the most-common case higher in the condition tests.  This is also a part of heuristics, which is basically intuition or educated guessing, etc.  Using pseudo-code, here’s an example:

while ($rownum -lt $total) {
  switch ($dataset[$rownum].SomeProperty) {
    value1 { Do-Something; break; }
    value2 { Do-SomethingElse; break; }
    default { Fuck-It; break; }
  }
  $rownum++
}

Let’s assume that “value2” is found in 90% of the $dataset rows.  In this basic while-loop with a switch-case condition test, a small data set (chewed up into $dataset), won’t reveal much in terms of prioritizing the switch() tests.  Remember, that mess above is “pseudo-code” so don’t yell at me if it blows up if you try to run it.

Anyhow, what happens when you’re chewing through 400 billion rows of terabytes of data? The difference between putting “value2” above “value1” can be significant.
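To put rough numbers on that, here’s a small sketch that counts how many comparisons each ordering costs. Everything in it is made up for illustration: the Measure-ComparisonCount function, the 1,000-row sample, and the 90/10 split. It uses a plain first-match loop rather than switch(), so the counting is explicit.

```powershell
# Invented sample data: 90% of rows are 'value2', 10% are 'value1'
$dataset = foreach ($n in 1..1000) {
    if ($n % 10 -eq 0) { 'value1' } else { 'value2' }
}

# Hypothetical helper: counts comparisons when testing each row against candidates in the given order
function Measure-ComparisonCount {
    param (
        [string[]] $Rows,
        [string[]] $TestOrder
    )
    $count = 0
    foreach ($row in $Rows) {
        foreach ($candidate in $TestOrder) {
            $count++
            if ($row -eq $candidate) { break }   # stop at the first match
        }
    }
    return $count
}

$rareFirst   = Measure-ComparisonCount -Rows $dataset -TestOrder 'value1','value2'
$commonFirst = Measure-ComparisonCount -Rows $dataset -TestOrder 'value2','value1'

Write-Host "rare case tested first:   $rareFirst comparisons"
Write-Host "common case tested first: $commonFirst comparisons"
```

With this sample the rare-first order costs 1,900 comparisons and the common-first order costs 1,100. Trivial at a thousand rows, less so at a few billion.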

This is most commonly found with initialization loops.  Those are when you start with a blank or unassigned value, and as the loops continue, the starting value is incremented or modified.  There is often a test within the iteration that checks if the value has been modified from the original.  Since the initial (null) value may only exist until the first cycle of the iteration, it would make sense to move the condition [is modified] above [is not modified] since it skips an unnecessary test on each subsequent iteration cycle.

Make sense?  Geez.  I ran out of coffee 3 hours ago, and it almost makes sense to me.  Just kidding.

Sorted Conditions / Re-Filtering

Another pattern that you may run across is when you need to check if a value is contained within an Array of values.  For most situations, you can just grab the array and check if the value is contained within it and all is good.  But when the search array contains thousands or more elements, and you’re also looping another array to check for elements, you may find that sorting both arrays first reduces the overall runtime.  That’s not all, however.

What happens when the search value begins with “Z” and your search array contains a million records starting with “A”?  You will waste condition testing on A-Y.

What if you instead add a step within the iteration (loop) to essentially “pop” the previously checked items off of the search array?  So, after getting to search value “M”, the search array only contains elements which begin with “M” and on to “Z”, etc.

Figure 1 – Static Target Search Array

filtersearch1.png

Figure 2 – Reduction Target Search Array

filtersearch2

To help explain the quasi-mathematical gibberish above: S = Search Time, R = Array Reduction Overhead Time, N = Elements in Search Set.  So R+1 denotes the time incurred by calculating the positional offset, and moving the search array starting index to the calculated value.  Whereas, S alone indicates just starting each iteration on the first element of the (static) target array and incrementing until the matching value is found.

So, what does this look like with PowerShell?  Here’s *one* example…

Figure 3 – PowerShell sample code

param (
  [parameter(Mandatory=$False, HelpMessage="Pretty progressbar, but slower to run!")]
  [switch] $PrettyProgress
)
# build an array of ("A1","A2",...,"A1000","B1","B2",...) up to 26 x 1000 = 26000 elements

$searchArray = @()
$elementCount = 1000
$tcount = $elementCount * 26
$charArray = @()
$m = 0  # running element counter, used for the progress bar percentage

cls

Write-Host "building search array..."
for ($i = 65; $i -le (65+25); $i++) {
  $c = [char]$i
  $charArray += $c
  for ($x = 1; $x -le $elementCount; $x++) {
     $cc = "$c$x"
     $searchArray += $cc
     $m++
     if ($PrettyProgress) { Write-Progress -Activity "$($charArray -join ' ')" -Status "Building array set" -CurrentOperation "$c $x" -PercentComplete $(($m / $tcount) * 100) }
  }
}
# define list of search values...
$elementList = @("A50","C99","D75","K400","M500","T600","Z900")
$randomList  = @("T505","C99","J755","K400","A55","U401","Z960")

Write-Host "`nStatic search array"
foreach ($v in $elementList) {
  $t1 = Get-Date
  $test = ($v -in $searchArray)
  $t2 = Get-Date
  Write-Output "$v = $((New-TimeSpan -Start $t1 -End $t2).TotalSeconds)"
}

# protect the original target array for possible future use...
$tempArray = $searchArray

Write-Host "`nReduction search array"
foreach ($v in $elementList) {
  $t1 = Get-Date
  $test = ($v -in $tempArray)
  $t2 = Get-Date
  # this is the real "R"...
  $pos = [array]::IndexOf($tempArray, $v)
  $tempArray = $tempArray[$pos..$tempArray.GetUpperBound(0)]
  Write-Output "$v = $((New-TimeSpan -Start $t1 -End $t2).TotalSeconds)"
}

Figure 4 – PowerShell example process output

arraysearch1.png

The time values are in seconds, and will vary with each run depending upon thread processing overhead incurred by the host background processes.  But in general, the delta between the matched values in each iteration will be roughly the same.  To see this visually, here’s an Excel version…

Figure 5 – Spreadsheet table and Graph result

arraysearch2

It’s worth noting that the impact of R may vary by language, as well as processing platform (hardware, operating system, etc.) along a different vector than others, but that within the iteration tests, the differences should be roughly similar.

There are other methods to reduce the target array as well, which may depend upon the software language used to process the tasks.  For example, whether the interpreter or compiler makes a complete copy of the search array in the background in order to provide the index offset starting point to the script.
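One such alternative, sketched below, is to leave the array alone and just move the starting index forward, using the [array]::IndexOf overload that accepts a start position. This is not the method used in Figure 3, only a variation on it. It assumes the search values arrive in the same order as the array, and the array here is a smaller made-up set (26 x 100) to keep it quick.

```powershell
# Build a small ordered search array: A1..A100, B1..B100, ... Z1..Z100
$searchArray = foreach ($c in [char[]](65..90)) {
    foreach ($x in 1..100) { "$c$x" }
}

# Search values, listed in the same order they appear in the array
$elementList = 'A50','C99','D75','K40','M50','T60','Z90'

$start = 0
$positions = foreach ($v in $elementList) {
    # search only from the last match onward; no copy of the array is made
    $pos = [array]::IndexOf($searchArray, $v, $start)
    if ($pos -ge 0) { $start = $pos }
    $pos
}

for ($n = 0; $n -lt $elementList.Count; $n++) {
    Write-Host "$($elementList[$n]) found at index $($positions[$n])"
}
```

The trade-off is the same as before: a value that appears earlier than the current starting index will not be found (IndexOf returns -1), so this only suits ordered lookups.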

Again, this is all relatively meaningless for smaller data sets, or less complex data structures.  And it really only provides significant value for sequential (ordered) search operations, not for random search operations.

So, some questions might arise from this:

  1. If the source array is not sorted, does the sorting operation itself wipe out the aggregate time savings of the reduction approach?
  2. Where is the “tipping point” that would cause this approach to be of value?

These are difficult to answer.  The nature of the data within the array will have an impact I’m sure, as might the nature by which the array is provided (on demand or static storage, etc.).  To paraphrase a Don Jones statement: “try it and see.”

Now that I’m done pretending to be smart, I’m going to grab a beer and go back to being stupid.  As always – I welcome your feedback.  – Enjoy your weekend!